Dear all,

I am working with the sibling data of the PSID (documentation: https://simba.isr.umich.edu/FIMS/FIMS_UG.pdf) and want to drop duplicate observations.

The documentation says:

"The resulting output file will be a customized data set fit to your specifications. The sibling pairs will be in duplicate form, where ‘Sibling A’ and ‘Sibling B’ will be listed as two observations, once as A-B and again as B-A. This allows each researcher to make analytical decisions as to which individual is the focal individual, and which is the sibling of the focal individual."

So, let's suppose my data are structured as follows:

Code:
clear

input long pid long SIBNUM long pids long family
1 1 2 1
2 1 1 1
33 1 34 44
34 1 33 44
33 2 35 44
34 2 35 44
35 1 33 44
35 2 34 44
end
Here, pid is the personal identifier, pids is the personal identifier of the respective sibling, and SIBNUM is the number of the sibling. family is a family identifier based on pid and pids.
In the end, I want each sibling pair to appear only once. For example, the pair 33-34 currently appears twice in the sibling data (once as 33-34 and once as 34-33).
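
To make the goal concrete, here is a minimal sketch of the kind of result I am after, not a tested solution. It assumes that a pair is fully identified by the two identifiers pid and pids regardless of their order; pair_lo and pair_hi are hypothetical helper variables that I made up for illustration.

```stata
* Build an order-independent key for each pair:
* the smaller identifier goes in pair_lo, the larger in pair_hi.
* (pair_lo and pair_hi are hypothetical names, not PSID variables.)
generate long pair_lo = min(pid, pids)
generate long pair_hi = max(pid, pids)

* Within each pair, sort by pid so the kept row is reproducible,
* then keep only the first row (the one whose pid is the smaller id).
bysort pair_lo pair_hi (pid): keep if _n == 1

list pid SIBNUM pids family, sepby(family)
```

With the example data above, this should keep one row each for the pairs 1-2, 33-34, 33-35 and 34-35. Which of the two rows is kept matters if the focal individual matters for the analysis, so the tie-breaking rule is only a placeholder.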

Any suggestions on how to implement this would be highly appreciated. Thank you very much.

Best

Daniel