Hello everyone,
I have a panel dataset with vehicle IDs and refueling dates. I am cleaning the data and want to randomly drop duplicate refueling dates within each vehicle ID, so that one observation remains per vehicle and date. I don't think I can just use the duplicates command, because I believe it would also drop refueling dates that are repeated across different vehicle IDs, which I don't want.
Here is what I think might work, but any other suggestions would be appreciated:
set seed 1234
gen double shuffle1 = runiform()
gen double shuffle2 = runiform()
bysort vehicleid fuelingdate (shuffle1 shuffle2): keep if _n==1
drop shuffle1 shuffle2
Kindly provide me with some suggestions for this:
| vehicle id | refueling dates | 
| 13 | 13feb2021 | 
| 13 | 13feb2021 | 
| 13 | 26feb2021 | 
| 13 | 13feb2021 | 
| 13 | 21mar2021 | 
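
To make the example reproducible, here is a minimal sketch that rebuilds the five rows above and keeps one randomly chosen row per vehicle and date. The variable names vehicleid and fuelingdate come from the code earlier in this post; datestr is a hypothetical helper variable, and the sketch assumes the dates are stored as text like 13feb2021 before conversion.

```stata
* Sketch only: rebuild the five example rows from the table
clear
input int vehicleid str9 datestr
13 "13feb2021"
13 "13feb2021"
13 "26feb2021"
13 "13feb2021"
13 "21mar2021"
end

* Convert the text dates to Stata daily dates (datestr is a hypothetical helper)
gen fuelingdate = date(datestr, "DMY")
format fuelingdate %td
drop datestr

* Report how many rows are duplicated within each vehicle-date pair
duplicates report vehicleid fuelingdate

* Put rows in random order within each vehicle-date pair, then keep the first
set seed 1234
gen double shuffle1 = runiform()
bysort vehicleid fuelingdate (shuffle1): keep if _n == 1
drop shuffle1

* Errors out if vehicleid and fuelingdate do not uniquely identify the rows
isid vehicleid fuelingdate
list, noobs
```

With both vehicleid and fuelingdate before the parentheses, the groups are vehicle-date pairs, so a date shared by two different vehicles is never treated as a duplicate. If it does not matter which of the duplicate rows survives, duplicates drop vehicleid fuelingdate, force may be a shorter alternative, since it also works on the variable list it is given and keeps the first occurrence in each group.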