*********************************
*****Data analysis **************
*****Dien Hoa Anh Vu ************
*****10/07/2013 *****************
*********************************

*Open 'master file'
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\mothers.dta", clear

*Merge with 'using file' (joinby forms all pairwise combinations of records within each idmot)
joinby idmot using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\fathers.dta"

*Drop variables that are not of interest, or keep only the variables of interest, e.g. id, idmot, b20a, b20b
keep id idmot b20a b20b

*Save this file, then continue to merge with other files
*Keep only variables that will be used in the analysis
*Save the final dataset

*Start analyzing the data
*Demographic description: use the command TAB to see the frequency and percentage
tab b23

*Use the command TAB X Y, ROW to get the percentage of Y within each subgroup of X
tab x y, row

*To see the OR of DTE across the 5 sites (the i. prefix treats SITES as a categorical variable)
logistic DTE i.SITES

*Linear regression to predict the time of tooth eruption from the number of cigarettes
regress timeerupt cig

*Create variable SHS (starts as missing; the defining conditions are not yet specified)
gen SHS = .
*replace SHS = 0 if
*replace SHS = 1 if

*To see the HR of DTE between the 2 groups, SHS and non-SHS
*(the data must first be declared as survival-time data with stset, with DTE as the failure event)
stcox SHS

*Draw survival curve
stcurve, survival

*More commands will be added once the dataset is available.
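
*---------------------------------------------------------------
*Sketch only: a dry run of the steps above on cancer.dta, an example
*dataset shipped with Stata, because the study dataset is not yet available.
*Assumptions: studytime, died, drug and age are variables of cancer.dta,
*not of the study data; "treated" is a hypothetical stand-in for SHS and
*"died" stands in for the DTE event. Results here say nothing about the study.
*---------------------------------------------------------------

*Keep whatever is currently in memory safe while the example runs
preserve

*Load the bundled example dataset
sysuse cancer, clear

*Frequency and percentage of one variable
tab drug

*Percentage of the outcome within each subgroup (row percentages)
tab drug died, row

*Odds ratios of the outcome across groups; i. marks drug as categorical
logistic died i.drug

*Linear regression of a continuous outcome on a continuous predictor
regress studytime age

*Build a two-group indicator with gen/replace (drug 1 is the placebo arm)
gen treated = .
replace treated = 0 if drug == 1
replace treated = 1 if drug > 1 & !missing(drug)

*Declare the survival-time structure: analysis time and failure event
stset studytime, failure(died)

*Hazard ratio between the two groups, adjusted for age
stcox treated age

*Survival curves for each group, at the mean of the other covariates
stcurve, survival at1(treated=0) at2(treated=1)

*Bring back the data that were in memory before the example
restore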