name | date       | dob       | dup | id
     | 1/01/2019  |           |     | 100001
n1   |            | 5-Feb-80  |     | 100002
n1   | 9/11/2018  |           |     | 100003
n2   | 5/03/2019  | 14-Apr-83 | 1   | 100004
n2   | 12/03/2019 | 14-Apr-83 | 2   | 100005
n3   |            | 12-Dec-78 |     | 100006
n3   | 16/02/2019 | 6-Sep-99  |     | 100007
n4   | 14/01/2019 | 27-May-85 | 1   | 100008
n4   | 14/05/2019 | 27-May-85 | 2   | 100009
n4   |            | 27-May-85 | 3   | 100010
n5   | 30/04/2019 | 6-Nov-98  |     | 100011
n6   | 19/02/2019 | 2-Feb-99  |     | 100012
n7   |            | 5-Jul-79  |     | 100013
n8   | 2/11/2018  | 5-Jul-79  |     | 100014
n9   | 28/02/2019 | 28-May-78 | 1   | 100015
n9   | 8/07/2019  | 28-May-78 | 2   | 100016
1. Identify duplicate cases
Code:
* dup = 0 for a unique name/dob pair, otherwise 1, 2, ... within the pair
sort name dob date
quietly by name dob: gen dup = cond(_N==1, 0, _n)
Code:
* put the rows in random order, then number them from 100001
set seed 1234
gen rand = runiform()
sort rand
gen id = _n + 100000
Code:
* give every row of a name/dob group the smallest id in that group
egen group = group(name dob)
egen xid = min(id), by(group)
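For reference, here is one way the three snippets above might be chained into a single run. This is only a sketch and it rests on assumptions: the data in memory hold just name, date and dob (so dup, id, rand, group and xid do not exist yet), and rows with a missing name or dob should stay in their own name/dob group. By default egen's group() assigns missing to any row with a missing value in name or dob, and by(group) then treats all those rows as one group, so the sketch adds the missing option.

```stata
* Sketch only: assumes name, date and dob are the only variables in memory
set seed 1234
gen double rand = runiform()
sort rand
gen long id = _n + 100000

* missing: a missing name or dob counts as its own category
* instead of producing a missing group number
egen group = group(name dob), missing

* every row of a group receives the smallest id found in that group
egen long xid = min(id), by(group)

* flag duplicates within each name/dob pair (0 = unique, 1, 2, ... = copies)
sort name dob date
quietly by name dob: gen dup = cond(_N==1, 0, _n)
```

With this, the two n2 rows (ids 100004 and 100005 in the table) would share one xid, while a row with a missing dob would only be matched to other rows with the same name and a missing dob.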
I am using Stata 16.
Any advice will be greatly appreciated.
Thanks heaps in advance!
Jen