BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, data transformation, data loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP, and Android.

How to Keep Duplicated Variables based on the Value of another Column

Hello all!

I am cleaning up a dataset and am not sure how to proceed. The dataset has approximately 400,000 observations, with duplicates based on an ID number. For each ID, I want to keep only the observation with the highest outcome. So, for the following:

ID Number	Date	Outcome
3	2/2/22	4
3	2/2/22	3
3	2/2/22	3
3	2/2/22	2

I want to keep only the first row because it has the highest outcome code. The number of duplicates varies: some IDs have 5 corresponding outcome values; some have 2; I think one even has 10.

I have tried to google this, but got very confused by duplicates and dups. I'd really appreciate any suggestions anyone has.
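
The rule described above (for each ID, keep only the row with the highest outcome) could be sketched as follows. This is a minimal illustration in Python using only the standard library, not necessarily the tool the poster is working in. The column names `id`, `date` and `outcome`, the helper `keep_highest_outcome`, and the tie-breaking choice (keep the earliest row when several share the highest outcome) are all assumptions made for the example.

```python
import csv
import io

# Sample rows copied from the question; the column names are assumed.
SAMPLE = """id,date,outcome
3,2/2/22,4
3,2/2/22,3
3,2/2/22,3
3,2/2/22,2
"""


def keep_highest_outcome(rows, id_key="id", outcome_key="outcome"):
    """Return one row per ID: the first row seen with the highest outcome."""
    best = {}
    for row in rows:
        current = best.get(row[id_key])
        # Compare outcomes as numbers; the strict ">" keeps the earliest row on ties.
        if current is None or float(row[outcome_key]) > float(current[outcome_key]):
            best[row[id_key]] = row
    return list(best.values())


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = keep_highest_outcome(rows)
for row in kept:
    print(row["id"], row["date"], row["outcome"])  # prints: 3 2/2/22 4
```

Because this makes a single pass over the rows, it should scale to roughly 400,000 observations without trouble. Rows with a missing or non-numeric outcome would raise an error in `float(...)`, so they would need to be handled or filtered out first.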
