Hello there,

I have a survey dataset of patients, some of whom have congenital heart disease (CHD). Each patient can have multiple ICD-9 codes, and I would like to group patients into 3 categories according to severity.

The issue arises when a person has more than one condition: I would like to assign them to the highest category that applies.

Severe disease: "745.0", "745.1", "745.10", "745.11", "745.12", "745.19", "745.2", "745.3", "745.6", "745.60"
Moderate disease: "746.0", "746.00", "746.02", "746.09", "746.2", "746.3", "746.4", "746.5", "746.6"
Mild disease: "745.4", "745.5", "745.8", "745.9", "747.0", "747.10"

I can assign a category from the codes as shown below.
The loop runs over 1/25 because there are 25 diagnosis columns, DX1 to DX25. CHD = congenital heart disease.


3 = severe
2 = moderate
1 = mild

gen CHD = 0
forvalues j = 1/25 {
    * severe: inlist() accepts at most 10 string arguments, so the list is split in two
    replace CHD = 3 if inlist(DX`j', "745.0", "745.1", "745.10", "745.11", "745.12") | inlist(DX`j', "745.19", "745.2", "745.3", "745.6", "745.60")
    * moderate
    replace CHD = 2 if inlist(DX`j', "746.0", "746.00", "746.02", "746.09", "746.2", "746.3", "746.4", "746.5", "746.6")
    * mild
    replace CHD = 1 if inlist(DX`j', "745.4", "745.5", "745.8", "745.9", "747.0", "747.10")
}
label variable CHD "Congenital Heart Disease"


My first question is: with this code, how does Stata treat a subject who has diseases from more than one category? My second question is: if a person has diseases from two different categories, how can I assign them to the highest one (severe > moderate > mild)?
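
To make the second question concrete, here is a small sketch of what I am considering. My understanding (which may be wrong) is that each replace simply overwrites the previous value, so the last matching replace wins rather than the most severe one. Wrapping the new value in max() should avoid that, since CHD can then only go up. The toy data, the two-column loop, the non-CHD code "401.9", and the label name chd_lbl are all made up for illustration and are not from my real dataset.

```stata
* Toy data, made up for illustration: 3 patients, 2 diagnosis columns
clear
input str6 DX1 str6 DX2
"745.0" "745.4"
"746.2" "745.0"
"745.5" "401.9"
end

gen CHD = 0
forvalues j = 1/2 {
    * max() keeps the larger of the value so far and the new category,
    * so a later, milder code cannot overwrite an earlier, more severe one
    replace CHD = max(CHD, 3) if inlist(DX`j', "745.0", "745.1", "745.10", "745.11", "745.12") ///
        | inlist(DX`j', "745.19", "745.2", "745.3", "745.6", "745.60")
    replace CHD = max(CHD, 2) if inlist(DX`j', "746.0", "746.00", "746.02", "746.09", "746.2", "746.3", "746.4", "746.5", "746.6")
    replace CHD = max(CHD, 1) if inlist(DX`j', "745.4", "745.5", "745.8", "745.9", "747.0", "747.10")
}

* Hypothetical value label so the categories read as text in output
label define chd_lbl 0 "None" 1 "Mild" 2 "Moderate" 3 "Severe"
label values CHD chd_lbl
list
```

With this toy data I would expect the first two patients to end up as severe and the third as mild. The /// line continuation only works in a do-file, not when typed at the command line.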

Thank you