How to identify duplicates in the PSID sibling file

Dear all,

I work with the sibling data of the PSID (documentation: https://simba.isr.umich.edu/FIMS/FIMS_UG.pdf). I want to drop duplicate observations.

The documentation says:

"The resulting output file will be a customized data set fit to your specifications. The sibling pairs will be in duplicate form, where ‘Sibling A’ and ‘Sibling B’ will be listed as two observations, once as AB and again as B-A. This allows each researcher to make analytical decisions as to which individual is the focal individual, and which is the sibling of the focal individual."

So, let's suppose my data are structured as follows:

Code:

clear

input long pid long SIBNUM long pids long family
1 1 2 1
2 1 1 1
33 1 34 44
34 1 33 44
33 2 35 44
34 2 35 44
35 1 33 44
35 2 34 44
end

Here, pid is the personal identifier, pids is the personal identifier of the respective sibling, and SIBNUM is the number of the sibling. family is a family identifier, based on pid and pids.
In the end, I want each sibling pair, for example 33-34, to appear only once, whereas it currently appears twice in the sibling data (once as 33-34 and once as 34-33).

Any suggestions on how to implement this are highly appreciated. Thank you very much.

Best

Daniel
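
One possible approach, sketched against the example data above: build an order-independent key for each pair (smaller identifier first, larger second), tag the mirrored copy, and drop it. The helper variables id_lo, id_hi and dup are hypothetical names introduced only for this illustration, and the sketch assumes that a pair is fully described by pid and pids.

```stata
clear
input long pid long SIBNUM long pids long family
1 1 2 1
2 1 1 1
33 1 34 44
34 1 33 44
33 2 35 44
34 2 35 44
35 1 33 44
35 2 34 44
end

* Order-independent pair key: A-B and B-A get the same (id_lo, id_hi)
generate long id_lo = min(pid, pids)
generate long id_hi = max(pid, pids)

* Within each pair, sort by pid and flag every row after the first
bysort id_lo id_hi (pid): generate byte dup = _n > 1

* Inspect the flagged rows before dropping them
list pid pids family if dup

* Keep one row per unordered pair, then remove the helper variables
drop if dup
drop id_lo id_hi dup

list, sepby(family)
```

If every pair is guaranteed to appear in both directions, as the documentation describes, `keep if pid < pids` should give the same result in one line; the keyed version also behaves sensibly if a pair happens to be listed only once. Note that SIBNUM in the remaining row is the sibling number from the point of view of the retained pid.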

BJ Data Tech Solution
