Hello,
I have a case-control study with 59 cases matched 1:3 to controls on birth year. The match was done by adapting code from threads in this forum (e.g. "Matching cases and controls based age and gender"); the code is below. The data are now in wide format: 177 observations in total, three per case, each pairing the case with one of its matched controls. There are 300 variables in the wide format, half for the case (e.g. m_age, i_bw) and half for the control (denoted by the suffix _ctrl; e.g. m_age_ctrl, i_bw_ctrl).
The wide format is great for mcc analysis but I am also interested in using clogit and I cannot figure out how to reshape the data into a long format to allow this. I have found examples of how to reshape from long to wide but not the other way around.
As an example, I have tried reshape long m_age* i_bw*, i(record_id) j(casecon) but I receive the following error message:
no xij variables found
You typed something like reshape wide a b, i(i) j(j).
reshape looked for existing variables named a# and b# but could not find any. Remember this picture:
         long                                wide
        +---------------+                   +------------------+
        | i   j   a   b |                   | i   a1 a2  b1 b2 |
        |---------------| <--- reshape ---> |------------------|
        | 1   1   1   2 |                   | 1   1   3   2  4 |
        | 1   2   3   4 |                   | 2   5   7   6  8 |
        | 2   1   5   6 |                   +------------------+
        | 2   2   7   8 |
        +---------------+

        long to wide: reshape wide a b, i(i) j(j)    (j existing variable)
        wide to long: reshape long a b, i(i) j(j)    (j new variable)
r(111);
I'd appreciate any advice -- should I match in a different way that puts it in long format from the beginning?
Thank you in advance.
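For readers with the same problem, here is a minimal sketch of one possible route from the wide file to a long file suitable for clogit. It rests on two assumptions. First, reshape long expects bare stubs (m_age, not m_age*) and variable names of the form stub plus suffix, so the case variables need a suffix of their own before both halves can share a stub. Second, record_id is repeated on all three rows of a case, so it cannot serve as i(). The six-row dataset and the names set_id, pair_id, role and case are hypothetical and exist only for illustration.

```stata
* Toy wide data: two cases, each on three rows with one matched control per row
clear
input record_id m_age i_bw record_id_ctrl m_age_ctrl i_bw_ctrl
1 30 3200 101 28 3100
1 30 3200 102 33 3500
1 30 3200 103 29 2900
2 25 2800 104 26 3000
2 25 2800 105 31 3300
2 25 2800 106 24 2700
end

* The case's record_id identifies the matched set (hypothetical name set_id)
gen long set_id = record_id
* One row per case-control pair gives a unique i() for reshape
gen long pair_id = _n

* Give every case variable that has a *_ctrl twin an explicit _case suffix,
* collecting the bare stubs along the way
unab ctrlvars : *_ctrl
local stubs
foreach v of local ctrlvars {
    local stub = substr("`v'", 1, strlen("`v'") - 5)
    rename `stub' `stub'_case
    local stubs `stubs' `stub'
}

* The string option lets j() hold the suffixes "_case" and "_ctrl"
reshape long `stubs', i(pair_id) j(role) string

gen byte case = (role == "_case")
* Each case now appears once per pair; keep a single copy per matched set
duplicates drop set_id record_id case, force
drop pair_id role
sort set_id case record_id

* The long file could then be analysed with something like:
* clogit case m_age i_bw, group(set_id)
```

On the toy data this leaves eight rows: one case row and three control rows per matched set. In the real file, variables without a _ctrl twin (such as year_group and priority) are untouched by the loop and are simply carried along by reshape.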
Matching code:
// READ IN DATA FILE OF COMBINED CASES & CONTROLS
set seed 1234 // OR YOUR FAVORITE SEED
// GENERATE BIRTH-YEAR GROUPS (MODIFY LIMITS AS APPROPRIATE TO DATA)
// 2008 -> 1, 2009 -> 2, ..., 2017 -> 10; OTHER YEARS ARE LEFT MISSING
gen byte year_group = yearbirth - 2007 if inrange(yearbirth, 2008, 2017)
gen double shuffle = runiform() // TO RANDOMIZE MATCH SELECTIONS
// FORM A FILE OF CONTROLS ONLY
preserve
keep if case_control == 0
// ASSIGN A PRIORITY FOR MATCHING WITHIN EACH YEAR_GROUP
// IN BATCHES OF (UP TO) THREE
by year_group (shuffle), sort: gen int priority = floor((_n-1)/3) + 1
drop shuffle
// RENAME VARIABLES TO AVOID CLASH
rename * *_ctrl
foreach x in year_group priority {
    rename `x'_ctrl `x'
}
tempfile controls
save `controls'
// NOW MAKE A FILE OF CASES
restore
keep if case_control == 1
// AGAIN PRIORITIZE FOR MATCHING
by year_group (shuffle), sort: gen int priority = _n
drop shuffle
// MERGE WITH CONTROLS
merge 1:m year_group priority using `controls', keep(master match)
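On the question of matching so that the data end up in long format from the start, here is a hedged sketch of one alternative. Cases and controls that share the same birth year and priority already form a matched set, so the rename and merge steps can be skipped and a set identifier built directly. The simulated data (4 cases, 36 controls, two birth years) and the names set_id and has_case are hypothetical; the priority rule is the same one used in the code above, and yearbirth stands in for year_group.

```stata
* Simulated combined file of cases and controls (for illustration only)
clear
set seed 1234
set obs 40
gen long record_id = _n
gen byte case_control = (_n <= 4)
gen int yearbirth = 2008 + mod(_n, 2)
gen double shuffle = runiform()   // randomizes match selection

* Within birth year: cases get priority 1, 2, ...;
* controls get priority in batches of (up to) three
bysort yearbirth case_control (shuffle): ///
    gen int priority = cond(case_control, _n, floor((_n - 1)/3) + 1)

* Everyone sharing birth year and priority belongs to one matched set
egen long set_id = group(yearbirth priority)

* Drop leftover batches of controls that have no case
bysort set_id: egen byte has_case = max(case_control)
keep if has_case
drop has_case shuffle priority

* The file is already long, one row per person:
* clogit case_control <covariates>, group(set_id)
```

With these simulated data there are two cases and eighteen controls per birth year, so four sets of one case plus three controls remain. As with keep(master match) in the merge, a case whose birth year runs out of controls stays in the file with fewer than three matches.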