Hi everyone
I have two datasets. The master data include, among others, a list of (numeric) zip codes as unique IDs.
The other data set (see example below) includes also a bunch of zip codes and information about the day a specific event happens in the respective zip code area:
input str6 ID str33 zipcode float day
"DE0007" "39130" 21793
"DE0007" "39120" 21793
"DE0007" "39124, 39126, 39128" 21793
"DE0007" "39171" 21793
"DE0007" "39108, 39110" 21793
"DE0007" "39108, 39110, 39130" 21793
"DE0007" "39126" 21793
"DE0007" "39122" 21793
end
format %td day
In general, I want to include the event-day-information ("day") from the second dataset into my master data, using the zip code as identifier. Therefore, I would try to merge the two data sets using the zip codes as identifiers. However, it is my understanding that to achieve this I would have to "destring" the zipcode-variable in the second dataset into observations (instead of variables).
For instance, the data row ""DE0007" "39124, 39126, 39128" 21793" should look like this:
"DE0007" "39124" 21793
"DE0007" "39126" 21793
"DE0007" "39128" 21793
Unfortunately, I'm not aware of a command that splits a string variables into observations instead of new variables. Does anybody have an idea for a workaround?
And as a "side question": Is merging the only way to include the event-day-information into the master data? I would prefer to use some command like "VLOOKUP" for different sheets in Excel (but in this case for different stata datasets) instead.
Thank you very much!
0 Response to Splitting a string variable into observations (instead of variables)
Post a Comment