Hello,

I am working with a panel data set comprising two waves, coded as SURVEY 1 and 2. The observations in survey 1 are recorded correctly. However, in survey 2 some observations have been recorded two or more times because of an error.

As a result, I am getting more observations in the second wave.

      IHDS1 |
  (2005) or |
      IHDS2 |
     (2012) |      Freq.     Percent        Cum.
------------+-----------------------------------
    IHDS1 1 |     29,397       49.30       49.30
    IHDS2 2 |     30,231       50.70      100.00
------------+-----------------------------------
      Total |     59,628      100.00

I have attached an example of the data set using dataex. In the example, id1 8 appears four times instead of twice (once in survey 1 and three times in survey 2), and the same is true of id1 38. Is there a way for me to correct this? I have tried 'duplicates report' but it did not work.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int SURVEY double HHBASE float id1
1 1010201010  1
2 1010201010  1
1 1010201020  2
2 1010201020  2
1 1010201030  3
2 1010201030  3
1 1010201040  4
2 1010201040  4
1 1010201050  5
2 1010201050  5
1 1010201070  6
2 1010201070  6
1 1010201080  7
2 1010201080  7
1 1010201090  8
2 1010201090  8
2 1010201090  8
2 1010201090  8
1 1010201100  9
2 1010201100  9
1 1010201120 10
2 1010201120 10
1 1010201130 11
2 1010201130 11
1 1010201140 12
2 1010201140 12
1 1010201160 13
2 1010201160 13
1 1010201170 14
2 1010201170 14
1 1010201180 15
2 1010201180 15
1 1010201190 16
2 1010201190 16
1 1010201200 17
2 1010201200 17
1 1010202010 18
2 1010202010 18
1 1010202020 19
2 1010202020 19
1 1010202030 20
2 1010202030 20
1 1010202040 21
2 1010202040 21
1 1010202060 22
2 1010202060 22
1 1010202070 23
2 1010202070 23
1 1010202100 24
2 1010202100 24
1 1010202110 25
2 1010202110 25
1 1010202140 26
2 1010202140 26
1 1010202150 27
2 1010202150 27
1 1010202160 28
2 1010202160 28
1 1010202170 29
2 1010202170 29
1 1010202180 30
2 1010202180 30
1 1010202190 31
2 1010202190 31
1 1010202200 32
2 1010202200 32
1 1010203020 33
2 1010203020 33
1 1010203050 34
2 1010203050 34
1 1010203060 35
2 1010203060 35
1 1010203070 36
2 1010203070 36
1 1010203090 37
2 1010203090 37
1 1010203100 38
2 1010203100 38
2 1010203100 38
2 1010203100 38
1 1010203110 39
2 1010203110 39
1 1010203120 40
2 1010203120 40
1 1010203130 41
2 1010203130 41
1 1010203140 42
2 1010203140 42
1 1010203150 43
2 1010203150 43
1 1010203160 44
2 1010203160 44
1 1010203170 45
2 1010203170 45
1 1010203190 46
2 1010203190 46
1 1010204010 47
2 1010204010 47
1 1010204040 48
2 1010204040 48
end
label values SURVEY SURVEY
label def SURVEY 1 "IHDS1 1", modify
label def SURVEY 2 "IHDS2 2", modify
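
For reference, here is a minimal sketch of how the repeated rows might be inspected and removed once the example above is loaded. It assumes that SURVEY and HHBASE together should uniquely identify an observation, which is an assumption and may not hold in the full data. The variable name dup is made up for illustration. Dropping is only safe if the repeated rows are identical on every variable; if they differ on other variables, which copy to keep has to be decided first.

```stata
* Sketch only; assumes one row per household (HHBASE) per wave (SURVEY).

* Report how many surplus copies exist for each SURVEY-HHBASE combination
duplicates report SURVEY HHBASE

* Tag the repeats: dup is 0 for unique rows and counts the extra copies otherwise
* (dup is a hypothetical variable name)
duplicates tag SURVEY HHBASE, generate(dup)
list SURVEY HHBASE id1 dup if dup > 0, sepby(HHBASE)

* With no varlist, -duplicates drop- removes only rows that are identical
* on ALL variables, keeping the first occurrence of each
drop dup
duplicates drop

* Check that SURVEY and HHBASE now uniquely identify the observations;
* -isid- exits with an error if they do not
isid SURVEY HHBASE
```

If isid still reports an error after this, the remaining repeats differ on at least one other variable and need to be examined individually before anything is dropped.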