I need help creating grouped variables based on string matching. Each patient has one row per diagnosis, and I want to group the diagnoses by drug. If a patient has any cannabis diagnosis, whether one or several, I want a binary variable that equals 1 for cannabis. If they have no cannabis diagnosis, the variable should equal 0.
I generated a new variable for each specific drug using:
Code:
gen byte cannabis = strmatch(ldiagnosis, "*cannabis*")
This isn't doing what I want because there are still multiple rows per patient. I want each patient to have ONE row, with each newly created drug variable equal to either 0 or 1. Is this possible, and if so, how do I do it?
How it is:
PatientID | Diagnosis
1 | Cannabis dependence, unspecified
1 | Cannabis dependence, uncomplicated
1 | Cannabis abuse
1 | Cocaine abuse
1 | Opioid abuse
2 | Cocaine abuse
2 | Cocaine dependence
2 | Cannabis dependence, unspecified
2 | Alcohol abuse
What I want:
PatientID | Cannabis | Cocaine | Opioid | Alcohol
1 | 1 | 1 | 1 | 0
2 | 1 | 1 | 0 | 1
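One possible way to get from the first layout to the second, offered as a sketch rather than a definitive answer: flag each drug on every row with strmatch(), then keep the maximum of each flag within patient using collapse. The example data below are retyped from the "How it is" table, and it is an assumption that ldiagnosis is simply a lowercased copy of Diagnosis. The drug list in the loop is also only illustrative.

```stata
* Recreate the example data from the post (assumed layout)
clear
input byte PatientID str40 Diagnosis
1 "Cannabis dependence, unspecified"
1 "Cannabis dependence, uncomplicated"
1 "Cannabis abuse"
1 "Cocaine abuse"
1 "Opioid abuse"
2 "Cocaine abuse"
2 "Cocaine dependence"
2 "Cannabis dependence, unspecified"
2 "Alcohol abuse"
end

* Assumption: ldiagnosis is the lowercased diagnosis text
gen ldiagnosis = lower(Diagnosis)

* One 0/1 flag per drug on every diagnosis row
foreach drug in cannabis cocaine opioid alcohol {
    gen byte `drug' = strmatch(ldiagnosis, "*`drug'*")
}

* The within-patient maximum is 1 if any row matched, 0 otherwise;
* collapse leaves one row per PatientID
collapse (max) cannabis cocaine opioid alcohol, by(PatientID)

list
```

Note that collapse keeps only the listed variables and the by() variable, so any other patient-level variables would need to be added to the collapse statement or merged back afterwards. If the original rows need to be preserved, something like egen with max() and by(PatientID) would be an alternative to collapsing.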