I have a panel dataset that contains the following variables for each observation: person_id, year and treatment.
Treatment is one of 65 categories, year ranges from 2012 to 2017, and person_id is an identifier unique to each person. Within a year, a person_id may appear more than once.
I would like to create a dummy variable which equals one for an observation if its person_id appears more than once within the year of the observation and those observations include at least 2 different treatment types.
If I run the following commands:
bysort person_id year: egen volume = count(treatment)
replace volume = 0 if volume == 1
replace volume = 1 if volume > 0
I get a dummy 'volume' which equals 1 for an observation if its person_id appears more than once in the year of the observation. However, these commands alone cannot distinguish person-years in which all treatment types are identical from person-years in which there are at least 2 different treatment types.
Any ideas on how to do this?
Help is as usual much appreciated!
Thanks in advance.
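One possible approach, as a rough sketch only: count the distinct treatment types within each person_id-year group and flag groups with two or more. The variable names first_obs, n_types, multi and multi_alt are made up for illustration, and the sketch assumes treatment has no missing values (egen's tag() excludes observations with missing values unless the missing option is given).

```stata
* Tag exactly one observation per distinct (person_id, year, treatment) combination
egen byte first_obs = tag(person_id year treatment)

* Sum the tags within each person-year: this is the number of distinct treatment types
bysort person_id year: egen n_types = total(first_obs)

* Dummy equals 1 if the person-year contains at least 2 different treatment types
generate byte multi = n_types >= 2

* Alternative without egen: sort treatment within each person-year and
* compare the first and last values; they differ only if there are at least 2 types
bysort person_id year (treatment): generate byte multi_alt = treatment[1] != treatment[_N]
```

A person-year with at least 2 different treatment types necessarily has more than one observation, so no separate check on the number of observations should be needed.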