I am trying to select 20 controls per case from a community database and then run a conditional logistic regression. However, the results are not reproducible: they differ each time I run the do-file. Can anyone kindly give me some advice? The do-file is listed below.
***select 20 controls
use "\\UOFA\USERS$\users3\Data analysis\Ado_2020\coverage data.dta", clear
drop SEIFA sex
encode GENDER, gen(gender)
label define gender 1 "Female" 2 "Male", replace
replace gender=. if gender==3
tab gender, missing
gen SEIFA=.
replace SEIFA=1 if Decile<=4
replace SEIFA=2 if Decile>4 & Decile<=7
replace SEIFA=3 if Decile>7 & Decile<=10
label define SEIFA 1 "Low" 2 "Mid" 3 "High"
label value SEIFA SEIFA
sort DateOfBirth
gen a_id=_n
keep a_id SEIFA gender abor DateOfBirth SERVICEDATE1 SERVICEDATE2 SERVICEDATE3 SERVICEDATE4 SERVICEDATE5
sort a_id DateOfBirth
save "control.dta", replace
use "\\UOFA\USERS$\users3\Data analysis\Ado_2020\dataset_wide_cleaned _16Oct2020.dta", clear
drop if DiseaseType1=="C"
rangejoin DateOfBirth -28 28 using "control.dta"
tab case_id
sort case_id notificationDate1 DateOfBirth
set seed 512695
gen double shuffle = runiform()
by case_id (shuffle), sort: keep if _n <21
drop shuffle
**create control dataset
keep case_id a_id abor_U gender_U SEIFA_U DateOfBirth_U SERVICEDATE1 SERVICEDATE2 SERVICEDATE3 SERVICEDATE4 SERVICEDATE5
gen case=0
rename gender_U gender
rename abor_U abor
rename SEIFA_U SEIFA
rename DateOfBirth_U DateOfBirth
save "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls_20Oct2020.dta", replace
**create case dataset
use "\\UOFA\USERS$\Data analysis\Ado_2020\dataset_wide_cleaned _16Oct2020.dta", clear
drop if DiseaseType1=="C"
keep case case_id Patientid dose gender abor ageyrs SEIFA DateOfBirth notificationDate1
**combine case and control datasets
append using "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls_20Oct2020.dta"
sort case_id notificationDate1
qui bysort case_id (notificationDate1): replace notificationDate1= notificationDate1[1]
gen notificationDate1_month=mofd(notificationDate1)
format notificationDate1_month %tm
gen servicedate1_month=mofd(SERVICEDATE1)
format servicedate1_month %tm
gen vacc_dose1months=notificationDate1_month-servicedate1_month
gen dose1=1 if case==0
replace dose1=0 if (vacc_dose1months<6 | vacc_dose1months==.) & case==0
gen servicedate2_month=mofd(SERVICEDATE2)
format servicedate2_month %tm
gen vacc_dose2months=notificationDate1_month-servicedate2_month
gen dose2=1 if case==0
replace dose2=0 if (vacc_dose2months<6 | vacc_dose2months==.) & case==0
replace dose=0 if dose1==0 & case==0
replace dose=1 if dose1==1 & dose2==0 & case==0
replace dose=2 if dose1==1 & dose2==1 & case==0
replace ageyrs=int((notificationDate1-DateOfBirth)/365) if ageyrs==.
save "U:\Desktop\Data analysis\Ado_2020\casecontrol_controls for cases_20Oct2020.dta", replace
** run conditional logistic regression
clogit case i.dose i.abor i.gender i.SEIFA, group( case_id ) or difficult
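
One possible cause, offered as a sketch rather than a confirmed diagnosis: sort breaks ties in an arbitrary order unless the stable option is given. In the do-file above, a_id is generated from _n after sorting on DateOfBirth alone, and runiform() is drawn after sorting on case_id notificationDate1 DateOfBirth, neither of which is likely to identify rows uniquely. If so, the same seed can attach different IDs and different random numbers to the same records on each run. The toy example below uses made-up data (3 cases with 10 candidate controls each, keeping 2 per case instead of 20); record_id and dob are hypothetical names, with record_id standing in for whatever unique identifier the real data may contain.

```stata
* Toy data only: all values and the names record_id and dob are assumptions
clear
set obs 30
gen case_id = ceil(_n/10)      // 3 made-up cases, 10 candidate controls each
gen record_id = _n             // hypothetical unique ID present in the raw data
gen dob = mod(_n, 4)           // deliberately tied, like a shared date of birth

* Step 1: an ID built from _n is reproducible only if the sort key is unique
sort dob record_id
isid dob record_id             // stops with an error if the key has duplicates
gen a_id = _n

* Step 2: fix the row order with a unique key before drawing random numbers
isid case_id a_id
sort case_id a_id
set seed 512695
gen double shuffle = runiform()

* Step 3: keep the first 2 rows per case in random order
bysort case_id (shuffle): keep if _n <= 2
drop shuffle
```

If the real data have no unique identifier, sort DateOfBirth, stable keeps the existing file order within ties, which should also be reproducible as long as the saved file itself does not change.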