Dear community,
I am currently trying to identify different individuals (across several years) within a dataset who have been given the same identifier.
To do this I wanted to generate two variables identifying duplicates in terms of:
1. the ID used
and
2. the ID in combination with sex and birthday
sort person_id
quietly by person_id : gen dupIDLT = cond(_N==1,0,_n)
sort person_id birthday sex
quietly by person_id birthday sex: gen dupLT = cond(_N==1,0,_n)
However, when I generate these, there may be three duplicates in each case, but dupIDLT may be numbered 1, 2, 3 while dupLT is numbered 1, 3, 2 for the observations in the years 2005-2007.
How can I make sure that both are numbered 1, 2, 3?
Best wishes,
Jil
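
A note on the likely cause: Stata's sort breaks ties in arbitrary order unless the stable option is given, so observations that share the same person_id (or the same person_id, birthday and sex) can end up in a different order after each sort. A minimal sketch of one way to get consistent numbering is to order the observations by year within each group. It assumes that the year is stored in a variable called year (a hypothetical name, the post does not give it) and uses a small made-up dataset purely for illustration:

```stata
* Toy data for illustration only; variable name "year" is an assumption
clear
input person_id year str10 birthday sex
1 2007 "1980-05-01" 1
1 2005 "1980-05-01" 1
1 2006 "1980-05-01" 1
2 2005 "1975-11-23" 2
end

* Group by person_id, but order observations by year within each group
bysort person_id (year): gen dupIDLT = cond(_N == 1, 0, _n)

* Group by person_id, birthday and sex, again ordered by year within group
bysort person_id birthday sex (year): gen dupLT = cond(_N == 1, 0, _n)

list, sepby(person_id)
```

In this toy example both variables number the 2005, 2006 and 2007 observations of person 1 as 1, 2, 3, and person 2 gets 0. The two variables can only agree where all observations of a person_id also share birthday and sex; otherwise dupLT restarts within each subgroup. If year does not uniquely identify observations within a group, remaining ties are still ordered arbitrarily, in which case adding further variables in the parentheses, or sorting first with sort ..., stable, may help.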