Hello,
I'm trying to find a more concise/efficient way to identify duplicate data. It's not as straightforward as it's been in the past.
My data contain a record_id, the start time of an event, and the end time of that event. There is one row per event, so each record_id has as many rows as it has events.
Example:
record_id StartTime EndTime
39 07jul2012 22:20:00 07jul2012 22:44:59
39 07jul2012 22:44:59 07jul2012 23:00:00
39 07jul2012 23:00:00 07jul2012 23:34:00
39 07jul2012 23:34:00 08jul2012 00:13:00
39 08jul2012 00:13:00 08jul2012 01:30:00
39 08jul2012 01:30:00 08jul2012 03:30:00
39 08jul2012 03:30:00 08jul2012 03:59:59
39 08jul2012 03:59:59 08jul2012 04:12:00
39 08jul2012 04:12:00 08jul2012 07:41:00
39 08jul2012 07:41:00 08jul2012 07:43:00
39 08jul2012 07:43:00 08jul2012 09:32:00
39 08jul2012 09:33:00 08jul2012 12:11:59
39 08jul2012 12:11:59 08jul2012 12:30:00
39 08jul2012 12:30:00 08jul2012 12:36:00
39 08jul2012 12:36:00 08jul2012 15:32:00
39 08jul2012 15:32:00 08jul2012 17:16:00
39 08jul2012 17:16:00 08jul2012 18:53:00
39 08jul2012 18:53:00 08jul2012 20:19:59
39 08jul2012 20:19:59 08jul2012 20:38:00
39 08jul2012 20:37:00 08jul2012 21:00:00
39 08jul2012 21:00:00 08jul2012 21:30:00
39 08jul2012 21:30:00 08jul2012 22:05:00
My data are sorted by record_id, StartTime, and EndTime. As you can see, within the same record_id the EndTime of one row is the same as the StartTime of the following row. Is there a way to clean my dataset in Stata so that EndTime holds the last true end time, by treating an EndTime as a duplicate whenever it matches the subsequent StartTime?
Thank you!
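One possible way to approach this is sketched below. It is only a sketch under assumptions, not a tested solution for this dataset. It assumes StartTime and EndTime are numeric Stata datetimes stored as doubles with a %tc format; if they are strings, they would first need converting, for example with clock(). The variable names newspell and spell are made up for illustration. A row is treated as continuing the previous event only when its StartTime exactly equals the previous EndTime, so in the example above the one-minute gap between 09:32:00 and 09:33:00 and the overlap between 20:38:00 and 20:37:00 would each start a new spell.

```stata
* Sketch only: assumes StartTime and EndTime are numeric %tc datetimes (doubles).
* If they are strings such as "07jul2012 22:20:00", convert them first, e.g.
*   generate double start_tc = clock(StartTime, "DMYhms")

* Flag a row as starting a new spell when it is the first row for its record_id
* or when it does not start exactly where the previous row ended.
bysort record_id (StartTime EndTime): generate byte newspell = (_n == 1) | (StartTime != EndTime[_n-1])

* Running count of spells within each record_id (newspell and spell are hypothetical names).
by record_id: generate long spell = sum(newspell)

* Keep one row per spell: the earliest StartTime and the latest EndTime.
collapse (min) StartTime (max) EndTime, by(record_id spell)
format StartTime EndTime %tc
```

If small gaps or overlaps should also count as the same event, the comparison in the first step would need to be loosened, for example to StartTime > EndTime[_n-1], which is a design choice that depends on how the data were recorded.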