I am working with a dataset that merges data from 2 survey rounds (2005 & 2011).
The first column/variable 'id' is the unique identification number for the individuals.
The second column/variable 'SURVEY' indicates whether the observation is from survey round 1 or 2.
I want to only keep the observations that are present in both surveys.
Currently, the data are sorted by 'id', and you may notice that some individuals have two observations, one from survey round 1 and the other from round 2. That's exactly what I need. However, is there a way to drop the individuals who have an observation in only one of the two rounds?
Here's a sample of the data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str12 id int SURVEY
"101020102010" 1
"101020102010" 2
"101020102011" 2
"101020102014" 2
"10102010205" 1
"10102010206" 1
"10102010206" 2
"10102010207" 1
"10102010207" 2
"10102010304" 1
"10102010305" 1
"10102010305" 2
"10102010306" 1
"10102010306" 2
"10102010307" 2
"10102010307" 1
"10102010403" 1
"10102010404" 1
"10102010404" 2
"10102010405" 2
"101020105010" 2
"101020105010" 1
"10102010507" 1
"10102010508" 1
"10102010509" 1
"10102010509" 2
"10102010708" 2
"10102010708" 1
"10102010709" 2
"10102010709" 1
"10102010804" 2
"10102010805" 2
"10102010806" 2
"10102010904" 1
"10102010906" 1
"10102010907" 2
"10102010907" 1
"10102010908" 1
"10102010908" 2
"10102010909" 2
"10102010909" 1
"101020110010" 2
"10102011203" 2
"10102011204" 2
"10102011304" 2
"10102011304" 1
"10102011305" 1
"10102011306" 1
"10102011306" 2
"10102011307" 2
"10102011307" 1
"10102011403" 2
"10102011403" 1
"10102011404" 2
"10102011404" 1
"10102011405" 1
"10102011405" 2
"10102011406" 2
"10102011407" 2
"10102011605" 2
"10102011605" 1
"10102011606" 2
"10102011606" 1
"10102011607" 1
"10102011607" 2
"10102011702" 1
"10102011703" 1
"10102011704" 1
"10102011704" 2
"10102011705" 1
"10102011705" 2
"101020118010" 1
"10102011806" 1
"10102011806" 2
"10102011807" 2
"10102011807" 1
"10102011808" 2
"10102011903" 1
"10102011903" 2
"10102011904" 2
"10102011904" 1
"10102011905" 2
"10102011905" 1
"10102011906" 2
"10102012007" 1
"10102012007" 2
"10102012008" 1
"10102012008" 2
"10102020103" 1
"10102020104" 1
"10102020105" 2
"10102020105" 1
"10102020303" 2
"10102020304" 2
"10102020305" 2
"10102020403" 1
"10102020405" 1
"10102020405" 2
"10102020505" 1
"101020206010" 2
end
label values SURVEY SURVEY
label def SURVEY 1 "IHDS1 1", modify
label def SURVEY 2 "IHDS2 2", modify
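One possible way to do this, offered only as a minimal sketch: within each 'id', sort by 'SURVEY' and check that the smallest value is 1 and the largest is 2. This assumes 'SURVEY' takes only the values 1 and 2, as in the sample above; the variable name `both` is a hypothetical helper, not something in the original data.

```stata
* Sketch, assuming SURVEY is coded only as 1 or 2.
* Within each id, sort by SURVEY so the first row holds the
* smallest round value and the last row holds the largest.
bysort id (SURVEY): gen byte both = SURVEY[1] == 1 & SURVEY[_N] == 2

* Keep only individuals observed in both rounds, then drop the helper.
keep if both
drop both
```

On the sample data, an id such as "101020102010" (rounds 1 and 2) would be retained, while "101020102011" (round 2 only) would be dropped. If an id had a missing value of 'SURVEY', it would sort last and the id would be dropped as well, so it may be worth checking for missing values first.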