Duplicate Data

Hello,

I'm trying to find a more concise/efficient way to identify duplicate data. It's not as straightforward as it's been in the past.

I have record-ids, start time of a first event, and end time of first event. This is then repeated in each row for as many events per record_id.

Example:

record_id StartTime EndTime
39 07jul2012 22:20:00 07jul2012 22:44:59
39 07jul2012 22:44:59 07jul2012 23:00:00
39 07jul2012 23:00:00 07jul2012 23:34:00
39 07jul2012 23:34:00 08jul2012 00:13:00
39 08jul2012 00:13:00 08jul2012 01:30:00
39 08jul2012 01:30:00 08jul2012 03:30:00
39 08jul2012 03:30:00 08jul2012 03:59:59
39 08jul2012 03:59:59 08jul2012 04:12:00
39 08jul2012 04:12:00 08jul2012 07:41:00
39 08jul2012 07:41:00 08jul2012 07:43:00
39 08jul2012 07:43:00 08jul2012 09:32:00
39 08jul2012 09:33:00 08jul2012 12:11:59
39 08jul2012 12:11:59 08jul2012 12:30:00
39 08jul2012 12:30:00 08jul2012 12:36:00
39 08jul2012 12:36:00 08jul2012 15:32:00
39 08jul2012 15:32:00 08jul2012 17:16:00
39 08jul2012 17:16:00 08jul2012 18:53:00
39 08jul2012 18:53:00 08jul2012 20:19:59
39 08jul2012 20:19:59 08jul2012 20:38:00
39 08jul2012 20:37:00 08jul2012 21:00:00
39 08jul2012 21:00:00 08jul2012 21:30:00
39 08jul2012 21:30:00 08jul2012 22:05:00

I have my data sorted by record_id starttime and endtime in sequential order. As you can see, the endtime is the same as the start time for the following line of data, for the same record_id. Is there a way to clean my dataset in STATA so that my endtime, is the last true endtime by identifying duplicates for endtime based on the subsequent starttime?

Thank you!

BJ Data Tech Solution

Home / Data Cleaning / Data management / Data Processing / Duplicate Data
Duplicate Data

0 Response to Duplicate Data

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Duplicate Data Duplicate Data

Related Posts with Duplicate Data

0 Response to Duplicate Data

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Duplicate Data
Duplicate Data