I am trying to select 20 controls per case from a community database and then run a conditional logistic regression. However, the results are not reproducible: they differ each time I run the do-file. Can anyone kindly give me some advice? The do-file is listed below.
***select 20 controls
use "\\UOFA\USERS$\users3\Data analysis\Ado_2020\coverage data.dta", clear
drop SEIFA sex
encode GENDER, gen (gender)
label define gender 1 "Female" 2 "Male", replace
replace gender=. if gender==3
tab gender, missing
gen SEIFA=.
replace SEIFA=1 if Decile<=4
replace SEIFA=2 if Decile<=7 & Decile>4
replace SEIFA=3 if Decile<=10 & Decile>7
label define SEIFA 1 "Low" 2 "Mid" 3 "High"
label value SEIFA SEIFA
sort DateOfBirth
gen a_id=_n
keep a_id SEIFA gender abor DateOfBirth SERVICEDATE1 SERVICEDATE2 SERVICEDATE3 SERVICEDATE4 SERVICEDATE5
sort a_id DateOfBirth
save "control.dta", replace
use "\\UOFA\USERS$\users3\Data analysis\Ado_2020\dataset_wide_cleaned _16Oct2020.dta", clear
drop if DiseaseType1=="C"
rangejoin DateOfBirth -28 28 using "control.dta"
tab case_id
sort case_id notificationDate1 DateOfBirth
set seed 512695
gen double shuffle = runiform()
by case_id (shuffle), sort: keep if _n <21
drop shuffle
**create control dataset
keep case_id a_id abor_U gender_U SEIFA_U DateOfBirth_U SERVICEDATE1 SERVICEDATE2 SERVICEDATE3 SERVICEDATE4 SERVICEDATE5
gen case=0
rename gender_U gender
rename abor_U abor
rename SEIFA_U SEIFA
rename DateOfBirth_U DateOfBirth
save "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls_20Oct2020.dta", replace
**create case dataset
use "\\UOFA\USERS$\Data analysis\Ado_2020\dataset_wide_cleaned _16Oct2020.dta", clear
drop if DiseaseType1=="C"
keep case case_id Patientid dose gender abor ageyrs SEIFA DateOfBirth notificationDate1
**combine case and control datasets
append using "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls_20Oct2020.dta"
sort case_id notificationDate1
qui bysort case_id (notificationDate1): replace notificationDate1= notificationDate1[1]
gen notificationDate1_month=mofd(notificationDate1)
format notificationDate1_month %tm
gen servicedate1_month=mofd(SERVICEDATE1)
format servicedate1_month %tm
gen vacc_dose1months=notificationDate1_month-servicedate1_month
gen dose1=1 if case==0
replace dose1=0 if (vacc_dose1months<6 | vacc_dose1months==.) & case==0
gen servicedate2_month=mofd(SERVICEDATE2)
format servicedate2_month %tm
gen vacc_dose2months=notificationDate1_month-servicedate2_month
gen dose2=1 if case==0
replace dose2=0 if (vacc_dose2months<6 | vacc_dose2months==.) & case==0
replace dose=0 if dose1==0 & case==0
replace dose=1 if dose1==1 & dose2==0 & case==0
replace dose=2 if dose1==1 & dose2==1 & case==0
replace ageyrs=int((notificationDate1-DateOfBirth)/365) if ageyrs==.
save "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls for cases_20Oct2020.dta", replace
** run conditional logistic regression
clogit case i.dose i.abor i.gender i.SEIFA, group( case_id ) or difficult
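A possible cause, offered as a guess rather than a confirmed diagnosis: Stata breaks ties in `sort` in an arbitrary order unless `stable` is specified. In the do-file, `sort DateOfBirth` followed by `gen a_id=_n` can therefore give people with the same birth date different `a_id` values on each run. Likewise, `sort case_id notificationDate1 DateOfBirth` does not identify rows uniquely, so even with a fixed seed the `runiform()` draws can land on different rows. The sketch below uses toy data (the `set obs` and `mod()` values are made up for illustration) and assumes that `case_id` together with `a_id` identifies each candidate control uniquely.

```stata
* Toy data only: 200 hypothetical candidate controls with tied birth dates
clear
set obs 200
gen DateOfBirth = mod(_n, 25)        // deliberate ties in the sort key

* stable keeps tied rows in their current order, so a_id is the same
* on every run as long as the input file order does not change
sort DateOfBirth, stable
gen a_id = _n

gen case_id = mod(_n, 4) + 1         // 4 hypothetical cases, 50 candidates each

* Put the rows in a fully determined order BEFORE drawing random numbers
isid case_id a_id                    // stops with an error if the key is not unique
sort case_id a_id
set seed 512695
gen double shuffle = runiform()

* Keep 20 randomly chosen controls per case
by case_id (shuffle), sort: keep if _n <= 20
drop shuffle
```

If the source data hold a genuine person identifier, using it as the tiebreaker in the first sort would be more robust than relying on `stable`, since `stable` only preserves whatever order the file already has.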