Hi,

I have a PHC4 dataset in which ICD-9 and ICD-10 codes are merged in the same variables (for example, admitting diagnosis and billing diagnoses). I've been trying to clean up the data, but the mixed coding is making that difficult. Ultimately I want to identify the most common diagnoses for descriptive studies, and also organize the data well enough to run regression analyses.

For example, this command works and creates a new variable with descriptions:
"icd10 generate admdescr = admdx, description"
but this one does not:
"icd9 generate admdescr = admdx, description"
Stata reports that the variable contains values that are not ICD-9 codes (which is true, although the ICD-10 version ran without complaint).

I thought about splitting the data into an ICD-9 section and an ICD-10 section, cleaning each one separately, and then, once they are re-merged, creating new indicator variables (aki, dm, etc.) to use in the regressions. I'm not sure whether that's the best or most efficient method. I've read the official Stata ICD help materials, but they haven't helped.
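To make the idea concrete, here is a rough sketch of the split I have in mind, done with a flag variable rather than two separate files. It rests on several assumptions: dischdate is a hypothetical name for a Stata daily-date discharge variable (my real variable may differ), the coding changed on 1 October 2015 (the US ICD-10-CM switch date), icd10cm is available (Stata 15 or later, I believe; icd10 could be substituted), and the icd9/icd10cm generate commands accept an if qualifier and only validate the selected observations. I have not confirmed that last point on my version.

```stata
* Hypothetical: dischdate is a Stata daily date; 1 Oct 2015 is the assumed cutover
generate byte icdver = cond(dischdate < td(01oct2015), 9, 10) if !missing(dischdate)

* Descriptions, using the command that matches each coding era
icd9 generate admdescr9 = admdx if icdver == 9, description
icd10cm generate admdescr10 = admdx if icdver == 10, description
generate admdescr = cond(icdver == 9, admdescr9, admdescr10)

* Example regression indicator: acute kidney failure
* (ICD-9 584.x, ICD-10 N17.x; code lists here are illustrative, not complete)
icd9 generate aki9 = admdx if icdver == 9, range(584*)
icd10cm generate aki10 = admdx if icdver == 10, range(N17*)
generate byte aki = (aki9 == 1) | (aki10 == 1) if !missing(icdver)
```

The same pattern would repeat for dm and the other conditions, and for each billing diagnosis variable.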

Any advice would be appreciated