BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, transformation, and loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches how to design and develop electronic data collection tools using CSPro, and how to use Stata commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP, and Android.

Duplicates report vs Codebook

Hi all, I have a very large dataset of 970,000 observations, which was given to me by an organisation.

I tried to merge this dataset with another, and the merge came back with the error:

stata does not uniquely identify observations in the master data

I figured this has to do with my ID variable. I checked for missing values in both the master and the merge file, and there are none.
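A minimal sketch of these checks, assuming the key variable is called id and the two files are master.dta and using.dta (all three names are placeholders, not taken from the post):

```stata
* Assumed names: id (merge key), master.dta and using.dta (the two files)
foreach f in master using {
    use `f'.dta, clear
    count if missing(id)        // observations with a missing key
    capture noisily isid id     // prints an error if id is not unique
}
```

Here isid is used only as a quick yes/no test of whether id uniquely identifies the observations in each file.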

I then checked for duplicates, as I figured this would be the only other reason (although I have not introduced any duplicates myself anywhere in my code).

I tried duplicates report

[Screenshot of the duplicates report output; not preserved]

I then tried to list the duplicates, but of course there were too many.
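When a full listing is too long, the duplicates can be summarised instead. A hedged sketch, again assuming the key is called id (a placeholder name) and using dup as a hypothetical name for the new variable:

```stata
* Assumed name: id (merge key); dup is a new, hypothetical variable
duplicates report id                 // duplicates judged on id alone
duplicates tag id, generate(dup)     // dup = number of other copies of this id
tabulate dup                         // how many observations have 0, 1, 2, ... extra copies
duplicates examples id               // one example observation per duplicated group
```

Note that duplicates report without a variable list compares observations on all variables, so it only flags rows that are identical in every column.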

I then tried codebook - as you can see, the number of unique values here differs.

[Screenshot of the codebook output; not preserved]

My question: why does codebook show a different number of unique values from duplicates report, which shows that there are 959,798 unique values?
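One possible source of such a gap is that the two commands count different things. The toy example below (made-up data with a hypothetical variable id, not the poster's dataset) sketches this: duplicates report counts observations whose value occurs exactly once, whereas codebook counts distinct values, including those that occur more than once.

```stata
* Toy data, purely illustrative: the value 2 appears twice
clear
input id
1
2
2
3
end

duplicates report id    // copies = 1 row: 2 observations (ids 1 and 3)
                        // copies = 2 row: 2 observations, surplus 1
codebook id             // unique values: 3 (ids 1, 2 and 3)

* Distinct values = total observations minus total surplus (4 - 1 = 3)
egen byte first = tag(id)   // flags one observation per distinct id
count if first              // 3
```

Whether this explains the difference in the 970,000-observation dataset depends on the actual output, which is not preserved here.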
