BJ Data Tech Solution

BJ Data Tech Solutions specializes in data processing, data management implementation plans, electronic and paper-based data collection tools, data cleaning specifications, data extraction, transformation and loading, analytical datasets, and data analysis. It teaches the design and development of electronic data collection tools using CSPro and the use of Stata commands for data manipulation, and it sets up data management systems using modern data technologies such as relational databases, C#, PHP and Android.

Keep the duplicate observation based on the highest value of a variable

Dear Statalist,

I recently combined multiple files (filename_v1, filename_v2, etc.) into a single dataset, and now I have duplicate observations. The datestamp variable alone should be enough to identify them, but to be sure I would like to define duplicates by both datestamp and resp_email. Within each set of duplicates, I want to keep the observation that came from the file with the highest number at the end of its name. Which number is highest may vary from one set of duplicates to the next.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str19 datestamp str44 resp_email str7 filename
"2020-06-18 09:53:22" "email" "FILE_v2"
"2020-06-18 09:53:22" "email" "FILE_v3"
end

Here I would like to keep only the second observation (FILE_v3). If the files were instead "FILE_v4" and "FILE_v3", I would like to keep only the FILE_v4 observation, because it is the one with the highest number.
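One possible approach, offered as a minimal sketch rather than a definitive answer: extract the number at the end of filename into a numeric variable, sort on it within each duplicate group, and keep the last observation. The variable name version is hypothetical, and the code assumes every filename ends in "_v" followed by digits; the assert stops the run if that assumption fails.

```stata
* Rebuild the example data from the post
clear
input str19 datestamp str44 resp_email str7 filename
"2020-06-18 09:53:22" "email" "FILE_v2"
"2020-06-18 09:53:22" "email" "FILE_v3"
end

* Assumption: every filename ends in "_v" followed by digits.
* version is a hypothetical helper variable holding that number.
gen version = real(regexs(1)) if regexm(filename, "_v([0-9]+)$")

* Stop here if any filename did not match the assumed pattern
assert !missing(version)

* Within each datestamp/resp_email group, sort by version and
* keep only the last observation, i.e. the highest version
bysort datestamp resp_email (version): keep if _n == _N

drop version
list
```

Comparing the number as a numeric variable rather than sorting on the filename string means that a version 10 would sort after a version 9. If two duplicates share the same version number, this sketch keeps one of them arbitrarily, so an additional tie-breaking variable in the parentheses may be needed.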
