I am looking for a more efficient way to harmonize my data in Stata while preserving a "codebook" that records the values each variable took in the raw dataset and the recoded values it takes in the merged/harmonized dataset. I have single-year files from a multi-year survey that I am merging together. The same variable has a different name in different years, and its categories may vary slightly from year to year. I would like a CSV file that serves both as a standalone record/changelog and as an input that my data management code reads, so that I am not updating the recodes in Excel and in the datasets separately. The Stata manual entry on frames looked promising, but I can't crack it. Can someone tell me whether it is possible to use frlink (or something else) to look up varname_old from the codebook frame, find that variable in the raw data frame, and rename it to the corresponding varname_new from the codebook frame?
I'm thinking my codebook frame would look something like this:
year | varname_old | varname_new    | recode0 | recode1 | recode2 | recode3 | recode4
1968 | V313        | edu_bracket_hd | 1       | 1       | 1       | 1       | 2
1969 | V794        | edu_bracket_hd | 1       | 1       | 1       | 1       | 2
1970 | V1485       | edu_bracket_hd | 1       | 1       | 1       | 1       | 2
The corresponding raw data would then be three different single-year files (1968, 1969, 1970), each storing what should become edu_bracket_hd under its own variable name (V313, V794, V1485).
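
To make the goal concrete, here is a rough, untested sketch of the kind of loop I have in mind. As far as I can tell, frlink matches observations across frames rather than variable names, so the sketch instead reads the codebook frame row by row with the frame prefix. The file names (codebook.csv, raw1968.dta, harmonized1968.dta, and so on) are made up for illustration, and I am assuming that recodeK holds the harmonized value for raw value K and that an empty cell means "no rule for that value".

```stata
* Sketch only: all file names below are hypothetical placeholders.
* Load the codebook CSV into its own frame; raw data stay in the default frame.
capture frame drop codebook
frame create codebook
frame codebook: import delimited using "codebook.csv", varnames(1) clear
frame codebook: local nrows = _N

foreach yr in 1968 1969 1970 {
    use "raw`yr'.dta", clear
    forvalues i = 1/`nrows' {
        * Copy one codebook row into local macros
        frame codebook {
            local rowyear = year[`i']
            local old = varname_old[`i']
            local new = varname_new[`i']
            * Build recode rules such as (0 = 1) (1 = 1) ... (4 = 2)
            local rules
            forvalues k = 0/4 {
                local to = recode`k'[`i']
                if !missing(`to') local rules `rules' (`k' = `to')
            }
        }
        * Skip codebook rows that belong to other years
        if `rowyear' != `yr' continue
        * Rules are applied to the original values, so they cannot chain
        recode `old' `rules', generate(`new')
        drop `old'
    }
    save "harmonized`yr'.dta", replace
}
```

If something like this is sound, the harmonized yearly files could then be combined with append, and the CSV would remain the single place where names and recodes are edited.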
I hope this is clear. I am open to any suggestions/advice. Thank you.