I need to drop some observations from my dataset. I have approx. 115,000 observations of different households in two different periods. Some households were surveyed twice, so in some cases I have information on the same household for both years. Other households were surveyed in only one of the two years. In addition, some households appear many times because different members of the family answered the survey. So a household can be repeated because the family has more than one member, because it was surveyed in two different years, or both. I need to make two different datasets from the original one:
- First, I need to keep only the observations (households) that were surveyed in both years, dropping the ones which were surveyed in only one of the years.
- Second, I need to take the units (households) which were surveyed in both years and drop only their observations for the first year, keeping those units for the second year. For example, household A appears 4 times, because it has two members who answered the poll and because it was surveyed twice: in 2014 and in 2015. I need to keep this household only when year is equal to 2015, and drop it when year is equal to 2014.
The relevant variables are:
- Household ID
- Year
- Member number
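To make the two goals concrete, here is a minimal sketch in plain Python rather than in any particular statistics package. The rows are made up, and the field names `household_id`, `year` and `member` are hypothetical stand-ins for the three variables listed above. It also assumes that, in the second dataset, households surveyed in only one year are left untouched, which is one possible reading of the request.

```python
# Made-up sample rows; field names are hypothetical stand-ins for
# the Household ID, Year and member variables.
rows = [
    {"household_id": "A", "year": 2014, "member": 1},
    {"household_id": "A", "year": 2014, "member": 2},
    {"household_id": "A", "year": 2015, "member": 1},
    {"household_id": "A", "year": 2015, "member": 2},
    {"household_id": "B", "year": 2014, "member": 1},
    {"household_id": "C", "year": 2015, "member": 1},
]

FIRST_YEAR, SECOND_YEAR = 2014, 2015

# Collect the set of survey years seen for each household.
years_by_household = {}
for row in rows:
    years_by_household.setdefault(row["household_id"], set()).add(row["year"])

# Households that were surveyed in both years.
in_both = {
    hh for hh, years in years_by_household.items()
    if {FIRST_YEAR, SECOND_YEAR} <= years
}

# Dataset 1: keep only households surveyed in both years (all their rows).
both_years = [r for r in rows if r["household_id"] in in_both]

# Dataset 2: for households surveyed in both years, drop the first-year rows.
# Assumption: households surveyed in only one year are kept as they are.
second_only = [
    r for r in rows
    if not (r["household_id"] in in_both and r["year"] == FIRST_YEAR)
]

print(len(both_years))   # 4 (household A, both years, two members)
print(len(second_only))  # 4 (A in 2015 twice, B in 2014, C in 2015)
```

The key step is counting distinct years per household rather than counting rows, since a household with several members has several rows within a single year.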
I hope I was clear.
Thank you in advance.