I have a long list (300,000+ entries) of medications administered (med_name) and the dates they were given (med_date). The data were in long format and I've converted them to wide. Some individuals have 5 meds but others have >1000.
I need to collapse these data into a usable format so that I can pull out individual meds as dichotomous variables. For example, I'd like to create 'bloodpressure_med' as a variable coded 1/0 according to whether any of a list of ~10 blood pressure meds appears anywhere in med1, med2, med3, med4, and so on. I have tried strpos(med*,"blood pressure")>0, but I receive an error that med* is not a valid variable name. It seems extremely inefficient to run separate commands across the 1000+ medication variables to parse out individual meds for each person. I also need the med_date that corresponds to that variable. So ideally I'd convert my current dataset of:
id | med1 | med1_date | med2 | med2_date
1 | bloodpressure | 2/1/2018 | steroid | 3/1/2018
2 | steroid | 4/5/2017 | insulin | 4/5/2017
3 | insulin | 1/5/2016 | |
4 | insulin | 3/15/2017 | bloodpressure | 3/1/2016
into this:
id | bloodpressure | bloodpressure_date | insulin | insulin_date
1 | 1 | 2/1/2018 | 0 |
2 | 0 | | 1 | 4/5/2017
3 | 0 | | 1 | 1/5/2016
4 | 1 | 3/1/2016 | 1 | 3/15/2017
How can I do this efficiently across thousands of variables?
Thank you in advance. I am still learning, so I appreciate the help and patience with what may be a simple question!
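To make the target transformation concrete, here is a minimal sketch of one possible approach, written in Python purely for illustration (it is not the software used in the question). The idea is to go back to long format, one row per id, med and date, and then derive each flag and its date per id, instead of searching across 1000+ wide columns. The names `MED_GROUPS`, `to_long` and `summarise` are hypothetical, the dates are assumed to be month/day/year, and keeping the earliest date when a med group appears more than once is an assumption the question does not settle.

```python
from datetime import datetime

# Example rows copied from the question; blank cells are empty strings.
wide_rows = [
    {"id": 1, "med1": "bloodpressure", "med1_date": "2/1/2018",
     "med2": "steroid", "med2_date": "3/1/2018"},
    {"id": 2, "med1": "steroid", "med1_date": "4/5/2017",
     "med2": "insulin", "med2_date": "4/5/2017"},
    {"id": 3, "med1": "insulin", "med1_date": "1/5/2016",
     "med2": "", "med2_date": ""},
    {"id": 4, "med1": "insulin", "med1_date": "3/15/2017",
     "med2": "bloodpressure", "med2_date": "3/1/2016"},
]

# Hypothetical lookup: output variable -> medication names that count toward it.
# In real data each set would hold the ~10 drug names for that class.
MED_GROUPS = {
    "bloodpressure": {"bloodpressure"},
    "insulin": {"insulin"},
}


def to_long(rows):
    """Yield (id, med_name, med_date) for every non-empty medN / medN_date pair."""
    for row in rows:
        n = 1
        while f"med{n}" in row:
            name = row[f"med{n}"].strip().lower()
            if name:
                # Assumption: dates are written month/day/year.
                date = datetime.strptime(row[f"med{n}_date"], "%m/%d/%Y").date()
                yield row["id"], name, date
            n += 1


def summarise(rows, groups):
    """Return one dict per id with a 0/1 flag and a date for each med group."""
    out = {}
    for row in rows:
        rec = {"id": row["id"]}
        for group in groups:
            rec[group] = 0
            rec[f"{group}_date"] = None
        out[row["id"]] = rec

    for pid, name, date in to_long(rows):
        for group, names in groups.items():
            if name in names:
                rec = out[pid]
                rec[group] = 1
                previous = rec[f"{group}_date"]
                # Assumption: keep the earliest date on repeated administrations.
                if previous is None or date < previous:
                    rec[f"{group}_date"] = date
    return list(out.values())


result = summarise(wide_rows, MED_GROUPS)
for rec in result:
    print(rec)
```

Because the work happens on long rows, adding another medication class only means adding an entry to the lookup, and the number of meds per person no longer matters.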