First time poster, so I’m sorry for any errors…
I have two ID variables (ID1 and ID2). I want to create a new ID variable taking into account duplicates in both. Where there are duplicates in EITHER ID1 OR ID2, I want to treat this as the same person. Essentially, I want to generate a new variable which looks like NewID below. I have tried using something like:
by ID1 ID2, sort: gen NewID=1 if _n==1
replace NewID = sum(NewID)
but this only treats rows as duplicates when BOTH ID1 and ID2 match. I guess something like the below would be ideal, but Stata doesn't allow the | symbol in the by varlist:
by ID1 | ID2, sort: gen NewID=1 if _n==1
replace NewID = sum(NewID)
I should also add that ID1 and ID2 are not ordered consistently, so I can’t just compare each row with the previous one using _n-1. Example data with the NewID I want:
ID1 ID2 NewID
1 a 1
1 b 1
2 b 1
3 c 2
1 g 1
4 c 2
5 d 3
5 e 3
6 f 4
Any help would be very much appreciated!! Thank you!
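One way to think about this: rows that share an ID1 or an ID2, directly or through a chain of other rows, form one group (a connected component). The sketch below is only an illustration of that idea, not a built-in Stata feature. It gives every row its own label and then repeatedly pulls each row's label down to the smallest label within its ID1 group and within its ID2 group until nothing changes. The names obs, comp and newcomp are made up for this example, and it assumes neither ID has missing values (rows with a missing ID would all fall into one by-group and be merged).

```stata
clear
input ID1 str1 ID2
1 a
1 b
2 b
3 c
1 g
4 c
5 d
5 e
6 f
end

gen long obs  = _n      // remember the original row order
gen long comp = _n      // start with every row in its own group

local changed = 1
while `changed' {
    local changed = 0
    foreach v in ID1 ID2 {
        // smallest current label among rows sharing this ID value
        bysort `v' (comp): gen long newcomp = comp[1]
        quietly count if newcomp != comp
        if r(N) > 0 local changed = 1
        quietly replace comp = newcomp
        drop newcomp
    }
}

// renumber the surviving labels as 1, 2, 3, ...
egen long NewID = group(comp)
sort obs
list ID1 ID2 NewID, clean
```

On the example data this should give the same NewID column as in the table above (1 1 1 2 1 2 3 3 4). The loop needs one extra pass per link in the longest chain, so it can be slow on large data with long chains.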