obsdiff command

Hello everyone,
I have a large dataset with many duplicates across different variables, and I am trying to find out what differs within each set of duplicates. I came across the obsdiff command (by Eric Booth) and tried using it, but I am not sure how to make it compare observations within groups of duplicates rather than across a range of rows.
I have created a sample table similar to my data. I need to find the differences in DOB, nationality, gender and result within duplicates of ID.
For example: what are the differences in DOB, nationality, gender and result within the duplicates of ID 1?

ID	DOB	Nationality	age	gender	result
1	1996	Jordan	25	F	P
1	1996	Jordan	25	F	P
1	1996	Egypt	25	F	P
1	1997	Egypt	25	F	N
1	1997	Jordan	25	F	N
1	1996	Qatar	24	F	P
2	1995	Lebanon	12	M	N
2	1995	Lebanon	12	M	N
2	1995	Lebanon	14	M	P
2	1995	Lebanon	11	M	P
2	1995	Lebanon	12	M	P
3	1998	Syria	21	F	N
4	1996	Syria	22	F	P
5	2000	Qatar	23	F	N

The code I have been using is:
obsdiff DOB Nationality age gender result, row(1/15)

I want to run the same comparison within the duplicates of each ID, without having to list the row range for each group of duplicates by hand (my original dataset has millions of duplicates).
Is this possible with this command?

Thank you!
Heba
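
One possible way to approach this without typing row ranges is sketched below. It does not use obsdiff at all, since I am not sure whether obsdiff accepts a by-group option; instead it relies on standard Stata commands (duplicates tag, bysort, egen). The diff_* variables, dup and firstrow are names made up for this illustration, and the data are simply the sample table above typed in with input.

```stata
* Sketch only: flag, for each ID, which variables are not constant
* across that ID's observations. Variable names follow the sample table.
clear
input ID DOB str10 Nationality age str1 gender str1 result
1 1996 Jordan  25 F P
1 1996 Jordan  25 F P
1 1996 Egypt   25 F P
1 1997 Egypt   25 F N
1 1997 Jordan  25 F N
1 1996 Qatar   24 F P
2 1995 Lebanon 12 M N
2 1995 Lebanon 12 M N
2 1995 Lebanon 14 M P
2 1995 Lebanon 11 M P
2 1995 Lebanon 12 M P
3 1998 Syria   21 F N
4 1996 Syria   22 F P
5 2000 Qatar   23 F N
end

* dup = number of extra copies of each ID (0 means the ID is unique).
duplicates tag ID, generate(dup)

* Sort each variable within ID and compare the first and last values:
* they differ exactly when the variable takes more than one value in the group.
foreach v of varlist DOB Nationality age gender result {
    bysort ID (`v'): generate byte diff_`v' = `v'[1] != `v'[_N]
}

* Show one line per duplicated ID with its difference flags.
egen byte firstrow = tag(ID)
list ID diff_* if firstrow & dup > 0, noobs
```

On the sample table this should flag DOB, Nationality, age and result for ID 1 (gender is always F) and only age and result for ID 2. The loop re-sorts the data once per variable, so on millions of observations it may take a while, but nothing has to be listed by hand.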

BJ Data Tech Solution
