Dear Statalist,

I recently combined multiple files (filename_v1, filename_v2, etc.) into a single dataset, and now I have duplicate observations. The datestamp variable alone should be enough to identify them, but to be sure I would like to define duplicates on both datestamp and resp_email. Within each set of duplicates, I want to keep the observation that came from the file with the highest number at the end of its name, since the version numbers involved may vary.

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str19 datestamp str44 resp_email str7 filename
"2020-06-18 09:53:22" "email" "FILE_v2"
"2020-06-18 09:53:22" "email" "FILE_v3"
end
Here I would like to keep only the second observation (FILE_v3). Similarly, if the files were "FILE_v4" and "FILE_v3", I would like to keep only the FILE_v4 observation, because it is the one with the highest number.
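
For concreteness, the rule described above could be sketched roughly as follows. This is only a minimal sketch under assumptions not stated in the data: it assumes every filename ends in "_v" followed by digits, and the variable name version is a hypothetical helper introduced purely for illustration. The suffix is converted to a number so that, for example, v10 would rank above v9 rather than being compared as text.

```stata
* Example data from this post
clear
input str19 datestamp str44 resp_email str7 filename
"2020-06-18 09:53:22" "email" "FILE_v2"
"2020-06-18 09:53:22" "email" "FILE_v3"
end

* Assumption: the version number is the run of digits after a trailing "_v".
* "version" is a hypothetical helper variable, not part of the original data.
gen long version = real(regexs(1)) if regexm(filename, "_v([0-9]+)$")

* Stop if any filename did not match the assumed pattern; a missing version
* would otherwise sort last and be kept by mistake.
assert !missing(version)

* Within each datestamp/resp_email group, sort by version and keep the last
* (highest-numbered) observation.
bysort datestamp resp_email (version): keep if _n == _N
drop version

list
```

If two duplicates came from files with the same version number, this sketch keeps one of them arbitrarily, so that case would need its own rule.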