BJ Data Tech Solution

Specializes in data processing, data management implementation plans, electronic and paper-based data collection tools, data cleaning specifications, data extraction, transformation, and loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP, and Android.

Combining observations

Hi Statalist,

I have a dataset with approximately 1,000,000 observations. Many ids appear more than once, and I want to keep only one observation per id. Each id has several values of information, and I want to keep the value whose first three characters occur most often within that id. For example, in the data below, the rows for id 278 should be reduced to the single row F04D15 278, because the prefix F04 appears twice for that id. I hope that makes sense. Do you have a suggestion on how to solve this in Stata?

[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input str14 information int id
"F04D15" 278
"F04D13" 278
"H02P21" 278
"H01P21" 278
"C12Y304" 1248
"A61K38" 1248
"C12N9" 1248
"C12N9" 1248
"C12Y304" 1271
"Y10S514" 1271
"A61K 38/00" 1271
"C12Y304" 1271
end
[/CODE]

Thanks in advance
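
One possible way to do this, offered as a sketch rather than a definitive answer: count how often each three-character prefix occurs within an id, then keep one row from the most frequent prefix. The variable names obs_order, prefix, prefix_count, and neg_count are illustrative and not part of the original data. The sketch also assumes that ties (between prefixes with equal counts, or between rows sharing the winning prefix) should be resolved by keeping the row that appears first in the data; the question does not say how ties should be handled.

```stata
* Example data from the question
clear
input str14 information int id
"F04D15" 278
"F04D13" 278
"H02P21" 278
"H01P21" 278
"C12Y304" 1248
"A61K38" 1248
"C12N9" 1248
"C12N9" 1248
"C12Y304" 1271
"Y10S514" 1271
"A61K 38/00" 1271
"C12Y304" 1271
end

* Record the original row order so that ties are broken reproducibly
* (assumption: the earliest row wins a tie)
gen long obs_order = _n

* First three characters of the code, used as the grouping key
gen str3 prefix = substr(information, 1, 3)

* Number of rows sharing the same prefix within each id
bysort id prefix: gen long prefix_count = _N

* Negate the count so that an ascending sort puts the most frequent prefix first
gen long neg_count = -prefix_count

* Within each id, keep the first row of the most frequent prefix
bysort id (neg_count obs_order): keep if _n == 1

* Remove the helper variables and show the result
drop obs_order prefix prefix_count neg_count
list, noobs
```

With the example data this keeps F04D15 for id 278, as requested. If you would rather keep the most frequent full code within the winning prefix instead of the earliest row, you could add a second count by id and information and include its negative in the sort order.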
