I am trying to select 20 controls per case from a community database and then run a conditional logistic regression. However, the results are not reproducible: they differ each time I run the do-file. Can anyone kindly give me some advice? The do-file is listed below.
***select 20 controls
use "\\UOFA\USERS$\users3\Data analysis\Ado_2020\coverage data.dta", clear
drop SEIFA sex
encode GENDER, gen(gender)
label define gender 1 "Female" 2 "Male", replace
replace gender=. if gender==3
tab gender, missing
gen SEIFA=.
replace SEIFA=1 if Decile<=4
replace SEIFA=2 if Decile>4 & Decile<=7
replace SEIFA=3 if Decile>7 & Decile<=10
label define SEIFA 1 "Low" 2 "Mid" 3 "High"
label value SEIFA SEIFA
sort DateOfBirth
gen a_id=_n
keep a_id SEIFA gender abor DateOfBirth SERVICEDATE1 SERVICEDATE2 SERVICEDATE3 SERVICEDATE4 SERVICEDATE5
sort a_id DateOfBirth
save "control.dta", replace
use "\\UOFA\USERS$\users3\Data analysis\Ado_2020\dataset_wide_cleaned _16Oct2020.dta", clear
drop if DiseaseType1=="C"
rangejoin DateOfBirth -28 28 using "control.dta"
tab case_id
sort case_id notificationDate1 DateOfBirth
set seed 512695
gen double shuffle = runiform()
by case_id (shuffle), sort: keep if _n <21
drop shuffle
**create control dataset
keep case_id a_id abor_U gender_U SEIFA_U DateOfBirth_U SERVICEDATE1 SERVICEDATE2 SERVICEDATE3 SERVICEDATE4 SERVICEDATE5
gen case=0
rename gender_U gender
rename abor_U abor
rename SEIFA_U SEIFA
rename DateOfBirth_U DateOfBirth
save "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls_20Oct2020.dta", replace
**create case dataset
use "\\UOFA\USERS$\Data analysis\Ado_2020\dataset_wide_cleaned _16Oct2020.dta", clear
drop if DiseaseType1=="C"
keep case case_id Patientid dose gender abor ageyrs SEIFA DateOfBirth notificationDate1
**combine case and control datasets
append using "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls_20Oct2020.dta"
sort case_id notificationDate1
qui bysort case_id (notificationDate1): replace notificationDate1= notificationDate1[1]
gen notificationDate1_month=mofd(notificationDate1)
format notificationDate1_month %tm
gen servicedate1_month=mofd(SERVICEDATE1)
format servicedate1_month %tm
gen vacc_dose1months=notificationDate1_month-servicedate1_month
gen dose1=1 if case==0
replace dose1=0 if (vacc_dose1months<6 | vacc_dose1months==.) & case==0
gen servicedate2_month=mofd(SERVICEDATE2)
format servicedate2_month %tm
gen vacc_dose2months=notificationDate1_month-servicedate2_month
gen dose2=1 if case==0
replace dose2=0 if (vacc_dose2months<6 | vacc_dose2months==.) & case==0
replace dose=0 if dose1==0 & case==0
replace dose=1 if dose1==1 & dose2==0 & case==0
replace dose=2 if dose1==1 & dose2==1 & case==0
replace ageyrs=int((notificationDate1-DateOfBirth)/365) if ageyrs==.
save "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls for cases_20Oct2020.dta", replace
** run conditional logistic regression
clogit case i.dose i.abor i.gender i.SEIFA, group( case_id ) or difficult
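
One possible explanation, offered as a sketch rather than a confirmed diagnosis: Stata's sort breaks ties in an order that can vary from run to run. If DateOfBirth is not unique in the control file, a_id = _n may label different people on each run, and the later runiform() draws attach to rows in whatever order the ties happened to fall. set seed fixes only the random-number stream, not the sort order. The toy example below uses hypothetical variables (person, dob) that are not in the original data. It shows one way to make such a shuffle deterministic: put the data in a unique sort order before assigning ids and before drawing random numbers.

```stata
* Toy data: 6 hypothetical people with ties on dob (names are illustrative only)
clear
input person dob
1 100
2 100
3 100
4 200
5 200
6 300
end

* Sorting on dob alone leaves the order within ties undefined.
* Adding a variable that uniquely identifies rows fully determines the order.
* isid stops with an error if person is not a unique key.
isid person
sort dob person
* a_id now refers to the same person on every run
gen long a_id = _n

* Set the seed and draw random numbers only after the data are uniquely sorted.
set seed 512695
gen double shuffle = runiform()

* Keep 2 random rows per dob group; a_id breaks any (very unlikely) ties in shuffle.
bysort dob (shuffle a_id): keep if _n <= 2
list
```

In the do-file above, this would correspond to sorting the control file on DateOfBirth together with a unique person identifier (assuming the coverage data contain one) before gen a_id=_n, and to running sort case_id a_id immediately before gen double shuffle = runiform(). The stable option of sort is a related alternative, but it only preserves the order the data already had, so it helps only if that incoming order is itself reproducible.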