Hello,
I'm trying to find a more concise/efficient way to identify duplicate data. It's not as straightforward as it's been in the past.
Each row has a record_id, the start time of one event, and the end time of that event. There is one row per event, so a record_id appears on as many rows as it has events.
Example:
record_id StartTime EndTime
39 07jul2012 22:20:00 07jul2012 22:44:59
39 07jul2012 22:44:59 07jul2012 23:00:00
39 07jul2012 23:00:00 07jul2012 23:34:00
39 07jul2012 23:34:00 08jul2012 00:13:00
39 08jul2012 00:13:00 08jul2012 01:30:00
39 08jul2012 01:30:00 08jul2012 03:30:00
39 08jul2012 03:30:00 08jul2012 03:59:59
39 08jul2012 03:59:59 08jul2012 04:12:00
39 08jul2012 04:12:00 08jul2012 07:41:00
39 08jul2012 07:41:00 08jul2012 07:43:00
39 08jul2012 07:43:00 08jul2012 09:32:00
39 08jul2012 09:33:00 08jul2012 12:11:59
39 08jul2012 12:11:59 08jul2012 12:30:00
39 08jul2012 12:30:00 08jul2012 12:36:00
39 08jul2012 12:36:00 08jul2012 15:32:00
39 08jul2012 15:32:00 08jul2012 17:16:00
39 08jul2012 17:16:00 08jul2012 18:53:00
39 08jul2012 18:53:00 08jul2012 20:19:59
39 08jul2012 20:19:59 08jul2012 20:38:00
39 08jul2012 20:37:00 08jul2012 21:00:00
39 08jul2012 21:00:00 08jul2012 21:30:00
39 08jul2012 21:30:00 08jul2012 22:05:00
My data are sorted by record_id, StartTime, and EndTime. As you can see, within the same record_id the EndTime of a row is usually the same as the StartTime of the following row. Is there a way to clean my dataset in Stata so that each run of such back-to-back rows is reduced to a single row whose EndTime is the last true EndTime, treating a row as a duplicate when its StartTime equals the previous row's EndTime?
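To make the intended logic concrete, here is a minimal sketch in Python rather than Stata. It is only an illustration under stated assumptions: the rows are already sorted by record_id, StartTime, and EndTime, and a row is merged into the previous one only when its StartTime exactly equals the previous EndTime for the same record_id. The names `parse_stamp` and `merge_spells` are hypothetical helpers, and the four sample rows are copied from the example above.

```python
from datetime import datetime

# Month abbreviations as they appear in the example (e.g. "07jul2012").
MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def parse_stamp(text):
    """Parse a stamp such as '08jul2012 07:41:00' into a datetime."""
    date_part, time_part = text.split()
    day = int(date_part[:2])
    month = MONTHS[date_part[2:5].lower()]
    year = int(date_part[5:])
    hour, minute, second = (int(piece) for piece in time_part.split(":"))
    return datetime(year, month, day, hour, minute, second)


def merge_spells(rows):
    """Collapse back-to-back rows into one (record_id, start, end) per run.

    Assumes rows are sorted by record_id, start, end. A row extends the
    previous run only if the record_id matches and its start equals the
    previous end exactly; any gap or overlap starts a new run.
    """
    merged = []
    for record_id, start, end in rows:
        if merged and merged[-1][0] == record_id and merged[-1][2] == start:
            merged[-1][2] = end  # extend the current run
        else:
            merged.append([record_id, start, end])  # start a new run
    return [tuple(run) for run in merged]


# Four consecutive rows taken from the example data.
rows = [
    (39, parse_stamp("08jul2012 07:41:00"), parse_stamp("08jul2012 07:43:00")),
    (39, parse_stamp("08jul2012 07:43:00"), parse_stamp("08jul2012 09:32:00")),
    (39, parse_stamp("08jul2012 09:33:00"), parse_stamp("08jul2012 12:11:59")),
    (39, parse_stamp("08jul2012 12:11:59"), parse_stamp("08jul2012 12:30:00")),
]

result = merge_spells(rows)
for record_id, start, end in result:
    print(record_id, start, end)
```

With exact matching, these four rows collapse to two: one ending at 09:32:00 and one starting at 09:33:00, because the one-minute gap breaks the chain. Whether gaps like that (or the row that starts at 20:37:00, before the previous row ends at 20:38:00) should also be merged is a separate decision that this sketch does not make.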
Thank you!