Hello,
I have a case-control study with 59 cases matched 1:3 to controls on birth year. The matching was done by adapting code from threads in this forum (e.g. "Matching cases and controls based age and gender"); the code is below. The data are now in wide format: 177 observations in total, three per case, each holding the case alongside one of its matched controls. There are 300 variables in the wide format, half for the case (e.g. m_age, i_bw) and half for the control (marked by the suffix _ctrl; e.g. m_age_ctrl, i_bw_ctrl).
The wide format is great for mcc analysis but I am also interested in using clogit and I cannot figure out how to reshape the data into a long format to allow this. I have found examples of how to reshape from long to wide but not the other way around.
As an example, I have tried reshape long m_age* i_bw*, i(record_id) j(casecon) but receive the following error message:
no xij variables found
You typed something like reshape wide a b, i(i) j(j).
reshape looked for existing variables named a# and b# but could not find any. Remember this picture:
         long                                wide
        +---------------+                   +------------------+
        | i   j   a   b |                   | i   a1 a2  b1 b2 |
        |---------------| <--- reshape ---> |------------------|
        | 1   1   1   2 |                   | 1   1   3   2  4 |
        | 1   2   3   4 |                   | 2   5   7   6  8 |
        | 2   1   5   6 |                   +------------------+
        | 2   2   7   8 |
        +---------------+

        long to wide: reshape wide a b, i(i) j(j)    (j existing variable)
        wide to long: reshape long a b, i(i) j(j)    (j new variable)
r(111);
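As far as I can tell, the error arises because reshape long expects each stub to be followed by a numeric suffix (m_age0, m_age1, ...), whereas here the pairs are named m_age and m_age_ctrl. In addition, record_id appears on three rows per case, so it cannot serve as i() on its own. A minimal sketch of one way around both problems, on invented toy data: it assumes the control's identifier is stored as record_id_ctrl (which is what the rename in the matching code would produce), and set_id, pair_id and is_case are hypothetical names introduced only for this illustration.

```stata
clear
* Toy wide data: 2 cases, each repeated on 3 rows next to one control (values invented)
input record_id m_age i_bw record_id_ctrl m_age_ctrl i_bw_ctrl
1 30 3200 101 28 3400
1 30 3200 102 33 3100
1 30 3200 103 25 3600
2 27 2900 104 31 3300
2 27 2900 105 29 3000
2 27 2900 106 35 3500
end

* Matched-set identifier: the case's id, copied to a name that is not reshaped
gen long set_id = record_id
* One identifier per wide row (one case-control pair)
gen long pair_id = _n

* Build the stub list from the control variables, then give case variables
* the suffix 1 and control variables the suffix 0
unab ctrlvars : *_ctrl
local stubs : subinstr local ctrlvars "_ctrl" "", all
foreach v of local stubs {
    rename `v' `v'1
    rename `v'_ctrl `v'0
}

reshape long `stubs', i(pair_id) j(is_case)

* Each case now appears once per pair; keep a single copy per matched set
bysort set_id is_case record_id: keep if _n == 1
drop pair_id

* 2 cases + 6 controls
assert _N == 8
```

With the real data, something like clogit is_case m_age i_bw, group(set_id) should then be possible. Variable names that already end in a digit may confuse the suffix matching, so it is worth checking the result on a few variables first.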
I'd appreciate any advice -- should I match in a different way that puts it in long format from the beginning?
Thank you in advance.
Matching code:
// READ IN DATA FILE OF COMBINED CASES & CONTROLS
set seed 1234 // OR YOUR FAVORITE SEED
// GENERATE BIRTH-YEAR GROUPS (MODIFY LIMITS AS APPROPRIATE TO DATA)
gen byte year_group = 1 if yearbirth==2008
replace year_group = 2 if yearbirth==2009
replace year_group = 3 if yearbirth==2010
replace year_group = 4 if yearbirth==2011
replace year_group = 5 if yearbirth==2012
replace year_group = 6 if yearbirth==2013
replace year_group = 7 if yearbirth==2014
replace year_group = 8 if yearbirth==2015
replace year_group = 9 if yearbirth==2016
replace year_group = 10 if yearbirth==2017
gen double shuffle = runiform() // TO RANDOMIZE MATCH SELECTIONS
// FORM A FILE OF CONTROLS ONLY
preserve
keep if case_control == 0
// ASSIGN A PRIORITY FOR MATCHING WITHIN EACH YEAR_GROUP COMBINATION
// IN BATCHES OF (UP TO) THREE
by year_group (shuffle), sort: gen int priority = floor((_n-1)/3) + 1
drop shuffle
// RENAME VARIABLES TO AVOID CLASH
rename * *_ctrl
foreach x in year_group priority {
    rename `x'_ctrl `x'
}
tempfile controls
save `controls'
// NOW MAKE A FILE OF CASES
restore
keep if case_control == 1
// AGAIN PRIORITIZE FOR MATCHING
by year_group (shuffle), sort: gen int priority = _n
drop shuffle
// MERGE WITH CONTROLS
merge 1:m year_group priority using `controls', keep(master match)
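On the question of matching so that the data are long from the start: a possible variation on the code above, sketched on invented toy data, assigns the same priority numbers but keeps cases and controls in one file instead of renaming and merging. It groups on yearbirth directly rather than year_group, which should be equivalent since the two map one-to-one; set_id and has_case are hypothetical names used only here.

```stata
clear
set seed 1234
* Toy data: 2 cases and 8 controls across two birth years (values invented)
input record_id case_control yearbirth
1 1 2008
2 1 2009
3 0 2008
4 0 2008
5 0 2008
6 0 2008
7 0 2009
8 0 2009
9 0 2009
10 0 2009
end

gen double shuffle = runiform() // randomize match selections

* Cases get priority 1, 2, 3, ... within birth year;
* controls get the same numbers in batches of (up to) three
bysort case_control yearbirth (shuffle): gen int priority = ///
    cond(case_control == 1, _n, floor((_n - 1)/3) + 1)

* A matched set is one birth-year/priority combination
egen long set_id = group(yearbirth priority)

* Drop leftover controls whose set contains no case
bysort set_id: egen byte has_case = max(case_control)
keep if has_case
drop shuffle has_case

* 2 cases + 3 controls each
assert _N == 8
```

The result has one row per subject with a shared set_id, so clogit case_control followed by the exposure variables and group(set_id) could be run without any reshape. As with keep(master match) in the merge, cases with fewer than three available controls stay in the data.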