BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, data transformation, data loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and STATA commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP and Android.

Consistently sorting data ahead of generating duplicates

Dear community,

I am currently trying to identify different individuals (across several years) within a dataset who have been given the same identifier.
To do this I wanted to generate two variables identifying duplicates in terms of:
1. the ID used
and
2. the ID in combination with sex and birthday

sort person_id
quietly by person_id : gen dupIDLT = cond(_N==1,0,_n)

sort person_id birthday sex
quietly by person_id birthday sex: gen dupLT = cond(_N==1,0,_n)

However, when I generate these for a person with three duplicate observations, dupIDLT may be numbered 1, 2, 3 while dupLT is numbered 1, 3, 2 for the observations in years 2005-2007.

How can I achieve that both are numbered 1,2,3?

Best wishes,
Jil
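
A likely cause of the mismatch: Stata's sort does not preserve the earlier order of observations that tie on the sort keys unless the stable option is given, so _n within each by-group can follow a different order after each sort. A minimal sketch of one way around this, assuming the dataset has a variable holding the observation year (called year here, a hypothetical name that does not appear in the post), is to use year as a tiebreaker inside each group. The toy data below are invented for illustration only.

```stata
* Hypothetical toy data: one person observed in three years, one observed once
clear
input long person_id str10 birthday byte sex int year
101 "1980-03-14" 1 2005
101 "1980-03-14" 1 2006
101 "1980-03-14" 1 2007
202 "1975-11-02" 2 2005
end

* Group by person_id; variables in parentheses are sorted on but do not
* define the groups, so _n follows year within each ID
bysort person_id (year): gen dupIDLT = cond(_N == 1, 0, _n)

* Same idea for the ID + birthday + sex combination
bysort person_id birthday sex (year): gen dupLT = cond(_N == 1, 0, _n)

* Inspect the result in a fixed order
sort person_id year
list, sepby(person_id)
```

If year does not uniquely identify observations within a group, the remaining ties are again ordered arbitrarily, so further variables would need to be added inside the parentheses until the order is fully determined.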
