Removing duplicate IDs by creating additional variables

Hi All,

I would like to restructure this dataset so that each ID appears only once, with additional columns for "Father" and "Mother." For example, ID 4001 appears twice because "PTYPE" (parent type) has two entries, F and M. How can I create new columns "Father" and "Mother" that hold a 1 or 0? After that, how can I collapse the data so that each ID appears only once? I would eventually like to do the same thing with the variable "PNP," so please ignore that column for now.

Many thanks in advance,
Cora

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(ID68 PN) byte GID float(ID68P PNP) str2 PTYPE float ID
4  1 2 4 906 "F" 4001
4  1 2 4 907 "M" 4001
4  2 2 4 908 "F" 4002
4  2 2 4 909 "M" 4002
4  3 3 4   1 "F" 4003
4  3 3 4   2 "M" 4003
4  4 3 4   1 "F" 4004
4  4 3 4   2 "M" 4004
4  5 3 4   1 "F" 4005
4  5 3 4   2 "M" 4005
4  6 3 4   1 "F" 4006
4  6 3 4   2 "M" 4006
4  7 3 4   1 "F" 4007
4  7 3 4   2 "M" 4007
4  8 3 4   1 "F" 4008
4  8 3 4   2 "M" 4008
4 30 4 4   5 "M" 4030
end
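
One possible approach, offered as a minimal sketch rather than a definitive answer: build the two indicators from PTYPE, then collapse to one row per ID, taking the maximum of each indicator. It assumes that ID68, PN, GID and ID68P are constant within each ID (as they are in the example data) and that PTYPE only takes the values "F" and "M". The variable names Father and Mother are the ones requested in the question.

```stata
* Run after the -dataex- example above has been loaded.

* 1 if the row is a father (mother) record, 0 otherwise
generate byte Father = (PTYPE == "F")
generate byte Mother = (PTYPE == "M")

* One row per ID: an indicator is 1 if any row for that ID had it set.
* Assumption: the by() variables do not vary within ID.
* Note: -collapse- drops PNP and PTYPE, which are not listed here.
collapse (max) Father Mother, by(ID ID68 PN GID ID68P)

* Check that ID now uniquely identifies observations
isid ID
list ID Father Mother, noobs
```

With the example data this should leave one observation per ID, and ID 4030, which only has an "M" row, ends up with Father = 0 and Mother = 1. Because `collapse` replaces the data in memory, it may be safer to wrap it in `preserve`/`restore` or to work on a copy while PNP still needs to be handled.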
