Hello everyone,
I have a panel dataset with a vehicle ID and refueling dates. While cleaning the data, I want to drop duplicate refueling dates within each vehicle ID, keeping one of the duplicates at random. I don't think I can simply use the duplicates command, because I believe refueling dates that merely coincide across different vehicle IDs would also be lost, which I don't want.
Here is what I think might work, but any other suggestions would be appreciated:
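set seed 1234
* random tie-breakers so that the duplicate that is kept is chosen at random
gen double shuffle1 = runiform()
gen double shuffle2 = runiform()
* group by vehicle AND date so that one row per vehicle-date survives
bysort vehicleid fuelingdate (shuffle1 shuffle2): keep if _n==1
drop shuffle1 shuffle2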
Kindly provide some suggestions for data that looks like this:
vehicle id | refueling dates |
13 | 13feb2021 |
13 | 13feb2021 |
13 | 26feb2021 |
13 | 13feb2021 |
13 | 21mar2021 |
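
To make the goal concrete, here is a minimal sketch run on the sample rows above. It assumes the variables are named vehicleid and fuelingdate (the names from my code, not the table headers) and that the date is stored as a Stata daily date. The datestr and shuffle variables are hypothetical helpers used only for this illustration.

```stata
* Rebuild the sample rows (variable names are assumptions)
clear
input int vehicleid str9 datestr
13 "13feb2021"
13 "13feb2021"
13 "26feb2021"
13 "13feb2021"
13 "21mar2021"
end

* Convert the text date to a Stata daily date
gen fuelingdate = daily(datestr, "DMY")
format fuelingdate %td
drop datestr

* Random tie-breaker so the surviving duplicate is picked at random
set seed 1234
gen double shuffle = runiform()

* One row per vehicle-date: group on both variables, order by the random draw
bysort vehicleid fuelingdate (shuffle): keep if _n == 1
drop shuffle

list, sepby(vehicleid)
```

For vehicle 13 this should leave three rows: 13feb2021, 26feb2021 and 21mar2021. With only these two variables the surviving duplicate looks the same whichever one is kept; the random draw only matters when the duplicate rows differ in other variables.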