Hi folks. I have data in the following format. Each observation represents a zip code, n_sites denotes the number of sites where people from that zip code were interviewed, n_choicesets is equal to n_sites+1, and sites_visited is a list of the distinct sites where people from that zip code were interviwed. The following is an example of the data structure:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long zip byte(n_sites n_choicesets) str42 sites_visited
11223 1 2 "Coconut Beach"
60660 3 4 "Coconut Beach Western Beach Turtle Beach"
48912 1 2 "Western Beach"
66206 2 3 "Turtle Beach Western Beach"
end
I want to figure out a way to turn the above data structure into this:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long zip byte(n_sites n_choicesets) str13 sites_visited
11223 1 2 "Coconut Beach"
11223 1 2 "No trip"
60660 3 4 "Coconut Beach"
60660 3 4 "Western Beach"
60660 3 4 "Turtle Beach"
60660 3 4 "No trip"
48912 1 2 "Western Beach"
48912 1 2 "No trip"
66206 2 3 "Turtle Beach"
66206 2 3 "Western Beach"
66206 2 3 "No trip"
end
so each zip code now is expanded by n_choicesets, and each zip code observation is matched with one of the sites from the original sites_visited list. The last observation for each zip code takes on the value "No trip" in the sites_visited column. I've done this before a while back, and I can't for the life of me remember how. Any help would be appreciated! thank you.
0 Response to Help with re-formatting a dataset for choice modelling?
Post a Comment