Hi
I am trying to match cases to controls on gender (exact match), age (within +/- 5 years) and BMI (within +/- 3). The code below produces matches, but some matched pairs have control BMI values outside the +/- 3 range. Could someone see what is wrong with this code? Thanks
clear
** creating matched data for age (+/- 5), gender(exact match) and BMI (-/+ 3)
******************************Data preparation task********************************************** ********
use "F:\OSA data\Latestcode\AllwithOSAdata_191121latest.dta"
** create cases subset
keep if id_casecntrl==1
keep if flag==1
rename id id_case
save "F:\OSA data\Latestcode\Cases1.dta", replace
** create controls subset
use "F:\OSA data\Latestcode\AllwithOSAdata_191121latest.dta"
keep if id_casecntrl==2
rename id id_cntl
save "F:\OSA data\Latestcode\Controls1.dta", replace
gen rand = runiform()
sort rand
drop rand
save "F:\OSA data\Latestcode\Controls2.dta", replace
*rename * *_cntl
*rename id_cntl id
*duplicates drop id, force
*save "C:\Users\venka\Desktop\NSWHealth\Venkatesha - Consults\1970 - Premala Sureshkumar\Controls3.dta", replace
******************************End of Data preparation task********************************************** ********
*Read the cases data file. Replace the file path of the data set appropriately in the program
use "F:\OSA data\Latestcode\Cases1.dta"
* matching (exact) on Gender, within +/- 5 years for age
compress
rangejoin ageatvisit -5 5 using "F:\OSA data\Latestcode\Controls2.dta", by (gender)
order id_case id_cntl gender ageatvisit
drop *_U
gen rand = runiform()
sort rand
drop rand
*rename *_U *_cntl
*rename id id_cases
*sort id_cases
*drop if id_casecntrl_cntl==.
*use matched control only twice for each matched case(preserving 1:2 case : control ratio)
*bysort id_cases: keep if _n <= 2
*Check how many controls were found for every case
*bysort id_cases: gen byte numcontrols = _N if _n == 1
*tab numcontrols
*drop if numcontrols == 1
*drop numcontrols
** Matching on age and gender is complete.
*rename id_cntl id
*drop *_cntl
*gen rand = runiform()
*sort rand
*drop rand
* matching within +/- 3 units of BMI
rangejoin bmi -3 3 using "F:\OSA data\Latestcode\Controls2.dta", by (id_cntl)
drop if ageatvisit_U==.
drop if gender_U==""
order id_case id_cntl gender gender_U ageatvisit ageatvisit_U bmi bmi_U
drop *_U
*sort id_case
*use matched control only twice for each matched case(preserving 1:2 case : control ratio)
bysort id_case id_cntl: keep if _n == 1
bysort id_case: keep if _n <= 2
*Check how many controls were found for every case
bysort id_case: gen byte numcontrols = _N if _n ==1
tab numcontrols
drop if numcontrols == 1
drop numcontrols
rename * *_case
rename (id_case_case id_cntl_case) (id_case id_cntl)
*drop *_U
*rename * *_case
*rename (id_cases_case id_case) (id_case id)
save "F:\OSA data\Latestcode\MatchedData_08December\Matched_Age GenderBMI0.dta", replace
use "F:\OSA data\Latestcode\Controls2.dta"
rename * *_cntl
rename id_cntl_cntl id_cntl
duplicates drop id_cntl, force
save "F:\OSA data\Latestcode\Controls3.dta", replace
use "F:\OSA data\Latestcode\MatchedData_08December\Matched_Age GenderBMI0.dta"
merge m:m id_cntl using "F:\OSA data\Latestcode\Controls3.dta"
order id_case id_cntl gender_case gender_cntl ageatvisit_case ageatvisit_cntl bmi_case bmi_cntl
drop if id_case==""
drop _merge
bysort id_case id_cntl: keep if _n == 1
*Check how many controls were found for every case
bysort id_case: gen byte numcontrols = _N if _n ==1
tab numcontrols
drop if numcontrols == 1
drop numcontrols
sort id_case
order id_case id_cntl gender_case gender_cntl ageatvisit_case ageatvisit_cntl bmi_case bmi_cntl
save "F:\OSA data\Latestcode\MatchedData_08December\Matched_Age GenderBMI1.dta", replace
** Matching on age,gender and BMI is complete.
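One possible explanation, offered as an assumption because the data are not shown: if a control can have more than one record per id_cntl (the `duplicates drop id_cntl, force` step suggests it can), the BMI join by id_cntl accepts a control when any of its records lies within 3 units, but the final `merge m:m` against Controls3.dta attaches whichever single record survived the duplicates drop. The reported bmi_cntl then need not be the value that satisfied the range. A minimal sketch that avoids the second join and the merge by applying the BMI condition to the same control record returned by the age join is below. It assumes `rangejoin` (from SSC) is installed, that bmi and ageatvisit exist under those names in both files, and that id_case and id_cntl identify people; the seed value is arbitrary.

```stata
* Sketch only, not a drop-in replacement for the full do-file above
use "F:\OSA data\Latestcode\Cases1.dta", clear

* Age within +/- 5 years, exact match on gender. Using-side variables whose
* names clash with the master come back with the default _U suffix.
rangejoin ageatvisit -5 5 using "F:\OSA data\Latestcode\Controls2.dta", by(gender)

* Cases with no control in the age range are kept with missing using values
drop if missing(id_cntl)

* Apply the BMI range to the very record that matched on age
keep if !missing(bmi, bmi_U) & abs(bmi - bmi_U) <= 3

* Random order, one record per case-control pair, at most two controls per case
set seed 12345
gen double rand = runiform()
bysort id_case id_cntl (rand): keep if _n == 1
bysort id_case (rand): keep if _n <= 2

* Drop every row of a case that ended up with only one control
bysort id_case: drop if _N < 2
drop rand

* Check that no retained pair violates the BMI range
assert abs(bmi - bmi_U) <= 3
```

In this sketch the control's values stay in the *_U variables of the joined row, so no later merge is needed to recover them. As in the original code, the same control may still be matched to more than one case.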