Dear community,
I am currently trying to identify different individuals (across several years) within a dataset who have been given the same identifier.
To do this I wanted to generate two variables identifying duplicates in terms of:
1. the ID used
and
2. the ID in combination with sex and birthday
sort person_id
quietly by person_id : gen dupIDLT = cond(_N==1,0,_n)
sort person_id birthday sex
quietly by person_id birthday sex: gen dupLT = cond(_N==1,0,_n)
However, when generating these there may be 3 duplicates in each case, but dupIDLT may be numbered 1,2,3 while dupLT is numbered 1,3,2 for the observations in years 2005-2007.
How can I achieve that both are numbered 1,2,3?
Best wishes,
Jil
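
A possible fix, offered as a sketch rather than a definitive answer: Stata's sort does not guarantee any particular order among observations that tie on the sort variables (unless the stable option is used), so the within-group numbering from _n can differ between the two sorts. Adding the year as a tie-breaker in parentheses makes the order deterministic. The variable name year below is an assumption, since the post does not say what the year variable is called; the sketch also assumes there is at most one observation per person_id and year.

```stata
* Assumption: a variable named year (hypothetical name) holds the observation year
* and there is at most one observation per person_id and year.

* Number observations sharing an ID, ordered by year within each ID
bysort person_id (year): gen dupIDLT = cond(_N == 1, 0, _n)

* Number observations sharing ID, birthday and sex, again ordered by year
bysort person_id birthday sex (year): gen dupLT = cond(_N == 1, 0, _n)
```

With this ordering, both variables count up by year within their groups. They would still differ for an ID that covers more than one birthday/sex combination, which is the case the second variable is meant to detect.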