**************************************
********* Analysis 2 of PCTC *********
******** Vu Dien - 05 Aug 2013 *******
**************************************

*==========================================================================
***** Independent variables **********
*--------------------------------------------------------------------------
* Variables   Description                                Values
*--------------------------------------------------------------------------
* shs         secondhand smoking                         1 Yes  0 No
* cigs        number of cigarettes smoked by fathers     continuous
*==========================================================================
***** Dependent variables ************
*--------------------------------------------------------------------------
* Variables   Description                                Values
*--------------------------------------------------------------------------
* time        time to eruption of the 1st tooth          months
* erupted     an erupted tooth                           1 Yes  0 No
*==========================================================================
***** Potential confounding factors **
*--------------------------------------------------------------------------
* Variables   Description                                Values
*--------------------------------------------------------------------------
* msmoke      mother's smoking status                    1 Yes  0 No
* mcigs       number of cigarettes smoked by mothers     continuous
* mage        mother's age                               continuous
* medu        mother's highest education level           1-->6
* alc         mother's alcohol drinking status           1 Yes  0 No
* income      family's income                            continuous
* sex         child's gender                             1 male
*                                                        2 female
* bw          birth weight                               continuous
* ga          gestational age at labor                   continuous
* site        study site                                 1 North
*                                                        2 Northeast
*                                                        3 Central
*                                                        4 South
*                                                        5 Bangkok
*==========================================================================

*==========================================================================
*Step 1: Find the code of variables in CRF files
*==========================================================================
*--------------------------------------------------------------------------------------
* variable   CRF                      file             name in that file      create new name
*--------------------------------------------------------------------------------------
* shs        ANT_B02B_ENG             b02b_a           HB22 (current)         yes
* cigs       ANT_B02B_ENG             b02b_a           HB22A                  yes
* msmoke     ANT_B02A_ENG             b02a_a           B22 (current)          yes
* mcigs      ANT_B02A_ENG             b02a_a           B22A                   yes
* erupted    ANT_C08_EN (6 months)    c08_1_a          c85 (at 6 months)      yes
*            ANT_D03_EN (12 months)   d03_a            d31 (at 12 months)     yes
* time       ANT_C08_EN (6 months)    c08_1_a          c85 (at 6 months)      yes
*            ANT_D03_EN (12 months)   d03_a            d31 (at 12 months)     yes
* mage       ANT_K02_ENG              k02_a            k21e1                  yes
* medu       ANT_K02_ENG              aj_ladda_23apr   k21ig                  yes
* alc        ANT_B02A_ENG             b02a_a           B23 (yes/no)           yes
* income                              aj_ladda_23apr   income                 no
* sex                                 aj_ladda_23apr   sex                    no
* bw         ANT_B05_ENG              b05_a            B53B                   yes
* ga         ANT_B04_ENG              b04_a            B42                    yes
* site       ANT_B02B_ENG             b02b_a           idmot (the 1st char)   yes
*--------------------------------------------------------------------------------------

*==================================================================
*Step 2: Convert the files which contain those variables into Stata
*==================================================================
*Using command INSHEET to convert .txt into .dta
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\b02b_a.txt",clear /*4256 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b02b_a.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\b02a_a.txt",clear /*4421 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b02a_a.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\c08_1_a.txt",clear /*4370 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\c08_1_a.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\d03_a.txt",clear /*4116 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\d03_a.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\k02_a.txt",clear /*4490 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\k02_a.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\Aj_ladda_23apr.txt", clear /*4245 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\Aj_ladda_23apr.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\b05_a.txt",clear /*4379 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b05_a.dta"
insheet using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Dataset\PCTC\PCTC Data\b04_a.txt",clear /*4355 obs*/
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b04_a.dta"
*----------------------------------------------------------------------------------------------

*=============================================================
*Step 3: Drop the variables that will not be used in the analysis.
*        In other words, keep only the variables of interest
*=============================================================
*Using command KEEP to keep only the variables of interest
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b02b_a.dta",clear
keep idmot hb21 hb22 hb22a
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b02b_a.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b02a_a.dta",clear
keep idmot b22 b22a b23
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b02a_a.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\c08_1_a.dta",clear
keep idchd c85
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\c08_1_a.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\d03_a.dta",clear
keep idchd d31
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\d03_a.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\k02_a.dta",clear
keep idmot k21e1
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\k02_a.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\Aj_ladda_23apr.dta",clear
keep idchd idmot sex k21ig income
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\Aj_ladda_23apr.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b05_a.dta",clear
keep idchd b53b
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b05_a.dta"
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC converting\b04_a.dta",clear
keep idmot b42
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b04_a.dta"
*-------------------------------------------------------------------------------------------

*==================================================================================
*Step 5: Merge all files together, using file "Aj_ladda_23apr.dta" as the master file
*==================================================================================
*Merge the master file with the 4 files which have IDMOT only, using option unmatched(master)
*to keep IDCHD in the master file if IDMOT is missing in any of the 4 files
*Use command JOINBY with option unmatched(master) to merge the files; save as finaldataset.dta
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\Aj_ladda_23apr.dta", clear
joinby idmot using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b02a_a.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset.dta"
joinby idmot using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b02b_a.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset.dta", replace
joinby idmot using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\k02_a.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset.dta", replace
joinby idmot using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b04_a.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset.dta", replace
*Now, there are 4245 obs in finaldataset.dta

*Next, open the master file, which is c08_1_a.dta (4370 obs), then merge it with finaldataset.dta,
*using IDCHD as the key variable, unmatched(master)
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\c08_1_a.dta", clear
joinby idchd using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1.dta"
joinby idchd using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\b05_a.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1.dta", replace

*Now there is only one file left to be merged: "d03_a.dta", which contains variable D31
*(time of tooth eruption, interviewed at 12 months)
*We will merge that file with the file finaldataset1.dta
*Case 1: unmatched(both) ==> 4406 obs
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1.dta", clear
joinby idchd using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\d03_a.dta", unmatched(both)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1_both.dta"
*Case 2: unmatched(master) ==> 4370 obs
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1.dta", clear
joinby idchd using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\d03_a.dta", unmatched(master)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1_master.dta"
*Case 3: unmatched(using) ==> 4116 obs
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1.dta", clear
joinby idchd using "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC selecting vars\d03_a.dta", unmatched(using)
drop _merge
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1_using.dta"
* I am not sure which case should be used. In my opinion, case 3 is preferable because, if there are
* missing values in finaldataset1.dta, we still keep the data of variable D31 from d03_a.dta.
* What do you think? I need this final dataset to continue with the remaining analysis.
*---------------------------------------------------------------------------------------------------------------

*============================
*Step 6: Create new variables
*============================
use "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1_using.dta", clear
save "D:\Hoc Hanh\KKU\YEAR II\TAKASILA\Assignment\PCTC joinby\finaldataset1_using_analyze.dta"

*Create var TIME: the month when 1st tooth erupted
gen time = c85
replace time = "" if regexm(time, "^[0-9]") == 0 /*Values that do not start with a digit are set to missing*/
replace time = "" if time == "0" /*Impossible value 0 is set to missing*/
destring time, replace
replace time = d31 if time == . & d31!=-9 /*Fill in with the month of eruption assessed at 12 months (var d31)*/
drop c85 d31

*Create variable ERUPTED: tooth eruption status (1 yes, 0 no)
gen erupted = 0
replace erupted = 1 if time >=1 & time!=.

*Create var SHS: secondhand smoking (yes/no)
recode hb21 (-9=.)
recode hb22 (-9=.) (-8=.)
recode hb22a (-9=.) (-8=.)
gen shs=.
replace shs = 1 if hb22a != 0 & hb22a !=.
replace shs = 1 if hb21 == 1
replace shs = 1 if hb22 == 1
replace shs = 0 if hb21 == 2
replace shs = 0 if hb22a == 0 & hb21 != 1
la def noyes 0 "No" 1 "Yes" /*Create label*/
la val shs noyes

*Create var CIGS: number of cigarettes smoked by fathers (continuous)
gen cigs=hb22a
replace cigs = 0 if shs == 0

/*
*Create var MSMOKE: mother's smoking status (yes/no)
gen msmoke=b22
recode msmoke (-9=.) (-8=.)
la val msmoke noyes
*Create var MCIGS: number of cigarettes smoked by mothers (continuous)
gen mcigs=b22a
recode mcigs (-9=.) (-8=.)
*/

*Create var MAGE, MEDU, ALC: age, education level, alcohol status of mothers
gen mage=k21e1
recode mage (-9=.) (0=.)
gen medu=k21ig
recode medu (-9=.)
la def l_medu 1 "Illiterate" 2 "Primary school" 3 "High school" 4 "Vocational training" 5 "University and higher" 6 "Other"
la val medu l_medu
gen alc=b23
recode alc (-9=.) (2=0)
la val alc noyes

/*
*Create intervals and new categorical variables for MAGE
xtile mage3t = mage, nq(3)
la def lmage3t 1 "13-24" 2 "25-30" 3 "31-48" /*Create label*/
la val mage3t lmage3t
tabstat mage, stat(n min max) by(mage3t)
*/

*Create var BW, GA: birthweight and gestational age of infants
gen bw=b53b
gen ga=b42

*Create new categorical variable of Birthweight
gen bwgroup=.
replace bwgroup=0 if bw>=2500 & !missing(bw) /*Stata treats missing as larger than any number, so exclude it*/
replace bwgroup=1 if bw<2500
la def l_bwgroup 0 "Normal BW" 1 "Low BW"
la val bwgroup l_bwgroup

*Create new categorical variable of Gestational Age
gen gagroup=.
replace gagroup=0 if ga>=37 & !missing(ga) /*Stata treats missing as larger than any number, so exclude it*/
replace gagroup=1 if ga<37
la def l_termbirth 0 "Term birth" 1 "Preterm birth"
la val gagroup l_termbirth

*Create var SITE: the study sites
gen site=trunc(idchd/10000000)
la def l_site 1 "North" 2 "Northeast" 3 "Central" 4 "South" 5 "Bangkok"
la val site l_site

*Recode variable SEX
recode sex (2=0)
la def l_sex 0 "female" 1 "male"
la val sex l_sex

*=================================
*Step 7: Start to analyze the data
*=================================
*------------------------------------------------------------------------------------------------------
* Table 1: Demographic characteristics
* For mothers
/*
tab mage3t site, col
bysort site: tabstat mage, stats(n mean sd min max)
*/
tab medu site, col miss
tab alc site, col miss
/*
tab income3t site, col miss
*/
* For infants
tab sex site, col miss
tab bwgroup site, col miss
tab gagroup site, col miss
*------------------------------------------------------------------------------------------------------

*------------------------------------------------------------------------------------------------------
* Table 2: Percentage of SHS in pregnant women among 5 sites
tab shs site, col miss
*------------------------------------------------------------------------------------------------------

*------------------------------------------------------------------------------------------------------
* Table 3: Crude hazard ratios (HR) of tooth eruption for each explanatory factor
* Event = erupted
stset time, failure(erupted)
local listvar "shs mage i.medu alc sex bwgroup gagroup"
foreach var of local listvar {
    stcox `var', strata(site)
}
*------------------------------------------------------------------------------------------------------

/*
*------------------------------------------------------------------------------------------------------
* Table 4: Adjusted HR of tooth eruption for each explanatory factor
*-------------------------------------------------------------------------------------
* Step 1: Stratified analysis
*Section 3.1 Effect of MEDU on the association between SHS and ERUPTED
cc erupted shs, by(medu) /*Test of homogeneity (M-H) p */
*Section 3.2 Effect of ALC on the association between SHS and ERUPTED
cc erupted shs, by(alc) /*Test of homogeneity (M-H) p */
*Section 3.3 Effect of SEX on the association between SHS and ERUPTED
cc erupted shs, by(sex) /*Test of homogeneity (M-H) p */
*Section 3.4 Effect of BWGROUP on the association between SHS and ERUPTED
cc erupted shs, by(bwgroup) /*Test of homogeneity (M-H) p */
*Section 3.5 Effect of GAGROUP on the association between SHS and ERUPTED
cc erupted shs, by(gagroup) /*Test of homogeneity (M-H) p */
*/
*-------------------------------------------------------------------------------------
* Step 2: Multivariable analysis : Cox regression
* Create interaction variables
gen shsbw = shs * bwgroup
gen shsga = shs * gagroup
gen shssex = shs * sex

* The initial model: the full model
stcox shs sex bwgroup gagroup shssex shsbw shsga, strata(site)
est store full

* Remove shssex (p=0.679)
stcox shs sex bwgroup gagroup shsbw shsga, strata(site)
lrtest full, force
* p = 0.679, so we can remove this variable shssex
est store model1

* Remove shsbw (p=0.491)
stcox shs sex bwgroup gagroup shsga, strata(site)
lrtest model1, force
* p = 0.493, so we can remove this variable shsbw
est store model2

* Remove shsga (p=0.310)
stcox shs sex bwgroup gagroup, strata(site)
lrtest model2, force
* p = 0.316, so we can remove this variable shsga
est store model3

* Remove gagroup (p=0.475)
stcox shs sex bwgroup, strata(site)
lrtest model3, force
* p = 0.473, so we can remove this variable gagroup

* SO THIS IS THE FINAL MODEL
stcox shs sex bwgroup, strata(site)
stcox shs sex bwgroup
*------------------------------------------------------------------------------------------------------
stset time, failure(erupted)
stsum
stci /*Median time to tooth eruption*/
graph box time, over(site) /*Box plot to see median time*/

* To test the equality of the survival functions between the SHS and non-SHS groups
* H0: S[SHS](t) = S[non-SHS](t) (Survival is the same)
* H1: S[SHS](t) != S[non-SHS](t) (Survival is not the same)
sts test shs, strata(site)
*------------------------------------------------------------------------------------------------------

*------------------------------------------------------------------------------------------------------
* Figure 3: Difference in the probability of erupted tooth between SHS group and non-SHS group
* Event = erupted
stset time, failure(erupted)
sts graph, by(shs) ci fail
*------------------------------------------------------------------------------------------------------

*------------------------------------------------------------------------------------------------------
* Figure 4: Difference in the probability of erupted tooth between female and male
sts graph, by(sex) ci fail
*------------------------------------------------------------------------------------------------------
*The End
*------------------------------------------------------------------------------------------------------

* Thinking about imputation for missing data
mi set mlong
mi register imputed mage
mi impute regress mage shs cigs bw, add(200) rseed(10394)
mi estimate: stcox shs mage /*variable is named mage in this file; "age" does not exist*/

* Table 6: Crude effect of each factor on DTE
* Multiple logistic regression
*======================================================================================================
*******************************************************************************************************
* The research question is: "Does SHS affect DTE?"
* SHS is the "risk of interest"
*******************************************************************************************************
*======================================================================================================

*====================================================================================
***** Dependent and independent variables
*------------------------------------------------------------------------------------
* Variables   Description                                Values
*------------------------------------------------------------------------------------
* delay       delayed first tooth eruption               1 Yes  0 No
* shs         secondhand smoking                         1 Yes  0 No
*====================================================================================
***** Potential confounding factors
*------------------------------------------------------------------------------------
* Variables   Description                                Values
*------------------------------------------------------------------------------------
* msmoke      mother's smoking status                    1 Yes  0 No
* mage        mother's age                               1 13-24
*                                                        2 25-30
*                                                        3 31-48
* medu        mother's highest education level           1-->6
* alc         mother's alcohol drinking status           1 Yes  0 No
* income      family's income                            1 Low <=66k
*                                                        2 Medium 66k-158k
*                                                        3 High >=158k
* sex         child's gender                             1 male
*                                                        0 female
* bwgroup     birth weight                               1 Low BW
*                                                        0 Normal BW
* gagroup     gestational age at labor                   1 < 37 wks
*                                                        0 >=37 wks
* site        study site                                 1 North
*                                                        2 Northeast
*                                                        3 Central
*                                                        4 South
*                                                        5 Bangkok
*=====================================================================================
* Step 1: Exploring the data and univariate analysis
list delay shs msmoke mage medu alc income sex bw ga site
tab delay
ci delay
*-------------------------------------------------------------------------------------
* Step 2: Bivariate (crude) analysis
*Section 2.1 Crude effect of SHS on DELAY
cs delay shs, or
*Section 2.2 Crude effect of MSMOKE on DELAY
cs delay msmoke, or /*p =0.67*/
*Section 2.3 Crude effect of MAGE3T on DELAY
tab mage3t delay, row chi2 exact /*p =0.068*/
csi 248 305 1159 1325, or /*to see OR of group 2 compared to group 1, p=0.44*/
csi 285 305 1070 1325, or /*to see OR of group 3 compared to group 1, p=0.11*/
logistic delay mage
*Section 2.4 Crude effect of MEDU on DELAY
tab medu delay, row chi2 exact /*p=0.01*/
replace medu=5 if medu==6 /*collapsed two categories because category 6 has few observations*/
tab medu delay, row chi2 exact /*p<0.001*/
csi 443 37 1694 190, or
csi 239 37 989 190, or
csi 52 37 303 190, or
csi 68 37 390 190, or
* Alternatively, use this command
logistic delay i.medu
*Section 2.5 Crude effect of ALC on DELAY
cs delay alc, or /*p =0.79*/
*Section 2.6 Crude effect of INCOME3T on DELAY
tab income3t delay, row chi2 exact /*p<0.001*/
csi 254 328 1188 1116, or
csi 249 328 1191 1116, or
logistic delay income
*Section 2.7 Crude effect of SEX on DELAY
cs delay sex, or /*p<0.001*/
*Section 2.8 Crude effect of BWGROUP on DELAY
cs delay bwgroup, or /*p<0.001*/
*Section 2.9 Crude effect of GAGROUP on DELAY
cs delay gagroup, or /*p=0.02*/
*Section 2.10 Crude effect of SITE on DELAY
tab site delay, row chi2 /*p<0.001*/
csi 319 125 829 697, or
csi 154 125 760 697, or
csi 139 125 695 697, or
csi 103 125 582 697, or
* Alternatively, use this command
logistic delay i.site
*-------------------------------------------------------------------------------------
* Step 3: Stratified analysis
*Section 3.1 Effect of BWGROUP on the association between SHS and DELAY
cc delay shs, by(bwgroup) /*Test of homogeneity (M-H) p = 0.13*/
*Section 3.2 Effect of GAGROUP on the association between SHS and DELAY
cc delay shs, by(gagroup) /*Test of homogeneity (M-H) p < 0.001*/
*Section 3.3 Effect of SITE on the association between SHS and DELAY
cc delay shs, by(site) /*Test of homogeneity (M-H) p = 0.023*/
*Section 3.4 Effect of SEX on the association between SHS and DELAY
cc delay shs, by(sex) /*Test of homogeneity (M-H) p < 0.001*/
*Section 3.5 Effect of INCOME on the association between SHS and DELAY
cc delay shs, by(income3t) /*Test of homogeneity (M-H) p < 0.725*/
*Section 3.6 Effect of MAGE on the association between SHS and DELAY
cc delay shs, by(mage3t) /*Test of homogeneity (M-H) p < 0.0134*/
*Section 3.7 Effect of MEDU on the association between SHS and DELAY
cc delay shs, by(medu) /*Test of homogeneity (M-H) p < 0.074*/
* We select SHS*SEX, SHS*MAGE, and SHS*GAGROUP (the interaction terms created in Step 4)
*-------------------------------------------------------------------------------------
* Step 4: Multivariable analysis : Logistic regression
* Create interaction variables
gen s_sex = shs * sex
gen s_mage = shs * mage
gen s_gagr = shs * gagroup

* Section 4.1. The initial model: the full model
xi: logistic delay shs mage i.medu sex bwgroup gagroup i.site s_sex s_mage s_gagr
est store full

* Section 4.2. Model without s_mage, as s_mage has the highest p-value (0.016)
xi: logistic delay shs mage i.medu sex bwgroup gagroup i.site s_sex s_gagr
lrtest full, force /* p=0.015, so need to keep s_mage in the model*/

* Section 4.3. Model without s_sex as s_mage has higher ordered term
xi: logistic delay shs mage i.medu sex bwgroup gagroup i.site s_mage s_gagr
lrtest full, force /* p<0.001, so need to keep s_sex in the model*/

* Section 4.4. Model without s_gagr as s_gagr has higher ordered term
xi: logistic delay shs mage i.medu sex bwgroup gagroup i.site s_sex s_mage
lrtest full, force /* p<0.001, so need to keep s_gagr in the model*/

* Considering the 3 interaction terms, we decided to keep only s_gagr
*********************************************************************
* Now, we start to run again from step 1 with the full model
* Full model
xi: logistic delay shs mage i.medu sex bwgroup gagroup i.site s_gagr
est store full

* Remove medu
xi: logistic delay shs mage sex bwgroup gagroup i.site s_gagr
lrtest full, force
* p = 0.027, so we can't remove this variable medu

* Remove mage
xi: logistic delay shs i.medu sex bwgroup gagroup i.site s_gagr
lrtest full, force
* p < 0.001, so we can't remove this variable mage

* The final model is the 1st model (full model)
xi: logistic delay shs mage i.medu sex bwgroup gagroup i.site s_gagr
*------------------------------------------------------------------------------------------
* I tried this one:
* Backward Stepwise:
xi: sw logistic delay shs mage i.medu sex bwgroup gagroup i.site s_mage, pr(0.2)
est store full

* Step 5: Assessing model adequacy: test for goodness of fit of the model
estat gof /*goodness-of-fit test*/

* Step 6: Obtaining measure of associations from the model
*------------------------------------------------------------------------------------------------------

*------------------------------------------------------------------------------------------------------
* Table 7: Model of association between the number of cigarettes smoked by the fathers
*          and the time of first tooth eruption
corr cigs time
* I see no correlation between cigs and time
regress time cigs mage bw ga income alc
* Draw a regression line with 95% CI
twoway lfitci time cigs, stdf || scatter time cigs /*stdf: SE for the forecast*/
*------------------------------------------------------------------------------------------------------
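* A minimal sketch, not part of the original analysis: CORR in Table 7 reports only the coefficient,
* so one possible way to support the "no correlation" remark is to also request the p-value and
* the number of pairs used.
* Assumption: cigs is a skewed count with many zeros, so a rank-based check is shown as well.
pwcorr cigs time, sig obs /*Pearson r with significance level and number of observations*/
spearman cigs time, stats(rho obs p) /*Spearman rank correlation, less sensitive to skewness*/
*------------------------------------------------------------------------------------------------------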