First time poster, so I’m sorry for any errors…
I have two ID variables (ID1 and ID2), and I want to create a new ID variable that takes duplicates in both into account. Where there are duplicates in EITHER ID1 OR ID2, I want to treat the observations as the same person. Essentially, I want to generate a new variable that looks like NewID in the example data below. I have tried something like:
by ID1 ID2, sort: gen NewID=1 if _n==1
replace NewID = sum(NewID)
but this only groups observations that are duplicates on BOTH ID1 and ID2. I guess something like the following would be ideal, but Stata doesn't accept the | symbol in the by varlist:
by ID1 | ID2, sort: gen NewID=1 if _n==1
replace NewID = sum(NewID)
I should also add that ID1 and ID2 are not ordered consistently, so I can’t just compare each observation with the previous one using _n-1. Example data, with the NewID I want:
ID1 ID2 NewID
1 a 1
1 b 1
2 b 1
3 c 2
1 g 1
4 c 2
5 d 3
5 e 3
6 f 4
Any help would be very much appreciated!! Thank you!
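The grouping described above (observations linked through a shared ID1 or a shared ID2, possibly through a chain of such links) can be sketched as an iterative "propagate the smallest label" loop. This is only a minimal sketch under stated assumptions, not a definitive solution: the helper variables obs, comp and m are hypothetical names introduced for illustration, and it assumes neither ID has missing values (observations with a missing ID would all be linked together).

```stata
clear
input byte ID1 str1 ID2
1 a
1 b
2 b
3 c
1 g
4 c
5 d
5 e
6 f
end

* obs keeps the original row order; comp is a provisional group label
* (both names are hypothetical, chosen for this sketch)
gen long obs  = _n
gen long comp = _n

* Repeatedly give every observation the smallest label found among all
* observations sharing its ID1, then among all sharing its ID2, until
* a full pass changes nothing.
local changed = 1
while `changed' {
    local changed = 0
    foreach v in ID1 ID2 {
        bysort `v' (comp): gen long m = comp[1]
        quietly count if m != comp
        if r(N) > 0 {
            local changed = 1
        }
        quietly replace comp = m
        drop m
    }
}

* Recode the labels to consecutive integers and restore the row order
egen long NewID = group(comp)
sort obs
list ID1 ID2 NewID, noobs
```

Because each final label is the smallest original row number in its group, egen's group() numbers the groups in order of first appearance, which matches the NewID column in the example data. On large datasets the repeated sorting may be slow, so treat this as an illustration of the logic rather than a tuned implementation.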