Hello,

I am struggling with something that I wanted to share. I have an unbalanced panel dataset. The panel has been described as follows:

xtset xwaveid wave, year
Here, the wave indicates the year, and xwaveid indicates the unique id for each respondent to the survey. However, it is the case that several people from the same household can have been surveyed as well. Households are enumerated using hhid, however this id is specific for that particular wave, and changes every wave.

Basically, what I want to do is, for each wave (2006-2014):

- Look at each hhid, and find the number of different xwaveid's under that hhid for that wave.
- If there is only one xwaveid for that hhid then there is no problem for that wave.
- If, however, there are more than one xwaveid's for that hhid for that wave, then I only want to keep the xwaveid for that wave which is older, by looking at hgage.

So I want to remove duplicates at a yearly level and not over the entire dataset. I know what I want to do but I could not translate it into a coding language. I have below provided an example of my data. I would highly appreciate any help.

Best,
Merve

P.S. Basically while it is possible to create a longitudinal individual dataset, I am trying to construct an individual dataset so that individuals will be representative of distinct households over time.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long xwaveid float wave long hhid int hgage
 108110 2009     81 36
 400830 2008   1061 24
 115868 2008   1061 26
 109993 2007   1431 25
 700012 2007   1431 30
 109850 2006   1521 26
 118285 2006   1561 36
 118286 2006   1561 37
 108023 2008   1631 38
 108023 2010   2381 40
 109850 2007   2441 27
 115525 2006   4071 22
 115525 2008   4171 24
 115525 2007   4541 23
 118285 2008   5991 38
 118286 2008   5991 39
 118286 2007   6471 38
 118285 2007   6471 37
 400075 2006   7801 26
 100619 2006   7801 24
 115131 2008   7971 28
 108023 2009   8241 39
 115131 2007   8641 27
 114769 2009   8891 18
 115131 2006   8901 26
 115131 2010   9352 30
 114322 2006   9881 35
 108110 2014 100111 41
 108110 2013 100121 40
 108110 2012 100181 39
 116083 2014 100631 26
 114322 2007  10131 36
 114322 2008  10331 37
1000282 2014 110251 29
 115525 2014 110251 30
1000282 2013 110281 28
 115525 2013 110281 29
 109013 2012 110511 33
 113069 2012 111501 50
 106444 2014 115121 21
 115131 2014 116851 34
 115131 2013 117292 33
 108902 2014 117391 29
 114263 2013 118321 29
 114263 2012 118751 28
 114570 2014 119452 21
 114570 2013 120381 20
 114570 2012 120881 19
 106945 2014 124391 35
 800232 2014 128451 32
 101763 2014 128451 34
 800232 2013 129981 31
 101763 2013 129981 33
 800232 2012 130691 30
 101763 2012 130691 32
 114263 2014 130991 30
 114569 2012 131211 21
1000305 2012 131211 62
 115131 2012 133951 32
 113075 2014 134071 70
1000276 2013 140491 28
 104369 2013 140491 36
1000276 2012 140531 27
 104369 2012 140531 35
 119207 2014 140701 24
 117956 2012 140761 30
 119207 2013 140771 23
 116020 2014 140771 25
 114768 2013 140861 27
 104369 2014 141031 37
1000276 2014 141031 29
 800232 2009  14191 27
 101763 2009  14191 29
 109809 2014 142551 40
 109808 2014 142551 40
 109809 2013 142681 39
 109808 2013 142681 39
 118272 2012 143251 25
 114768 2014 144221 28
 119381 2013 144861 31
 119380 2013 144861 40
 119380 2014 144881 41
 119381 2014 144881 32
 118272 2013 145321 26
 101411 2014 146651 30
 116593 2006  14691 51
 116594 2006  14691 48
 101411 2013 147041 29
 118272 2014 157451 27
 107223 2009  15921 20
 113341 2012 159371 23
 113341 2013 159831 24
 113341 2014 160171 25
 101979 2008  16131 18
 114769 2008  16891 17
 101979 2006  16961 16
 111422 2011 170021 25
 600377 2011 170021 24
 110312 2012 170021 84
 116423 2011 170031 49
end
format %ty wave
label values hgage FHGAGE