Hi all, I have a very large dataset of 970,000 observations, which was given to me by an organisation.
I tried to merge this dataset with another, which came back with the error:
stata does not uniquely identify observations in the master data
I figured this has to do with my ID variable. I checked for missing values in both the master and the merge file, and there are none.
I then checked for duplicates, as I figured this would be the only other reason (although nowhere in my code have I introduced any duplicates myself).
I tried duplicates report
[Image: output of duplicates report]
I then tried to list the duplicates, but of course there were too many.
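When listing every duplicate is impractical, tagging them and looking at a summary plus a small sample may be more manageable. The following is only a sketch on toy data: the variable name id, the new variable dup, and the values are all hypothetical stand-ins for the real file.

```stata
* Toy data standing in for the real file; id is a hypothetical variable name
clear
input long id
101
102
102
103
end

* dup = number of additional copies of each id (0 means no duplicate)
duplicates tag id, generate(dup)

* Summarise how many observations fall into each duplicate group size
tabulate dup

* Inspect only a small, manageable slice of the duplicated ids
sort id
list id dup if dup > 0 in 1/4
```

On a large file, the range after "in" can be widened or moved to page through the duplicated IDs a few at a time.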
I then tried codebook. As you can see, the number of unique values it reports differs.
[Image: output of codebook]
My question: Why does codebook show a different number of unique values from duplicates report, which shows 959,798 unique values?
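One possible reading, offered as a sketch rather than a diagnosis of this dataset: in duplicates report, the row with copies equal to 1 counts observations whose ID occurs exactly once, whereas codebook counts distinct values, so the two figures need not agree. The toy example below illustrates the difference; the variable names id and first and the data are assumptions made purely for illustration.

```stata
* Toy data, not the real dataset: id takes the values 1, 2, 2, 3, 3, 3
clear
input long id
1
2
2
3
3
3
end

* The copies = 1 row counts observations with no duplicate (only id 1 here)
duplicates report id

* codebook's "unique values" counts distinct values of id (1, 2 and 3 here)
codebook id

* Count distinct values directly by flagging one observation per id
egen byte first = tag(id)
count if first
```

If the same pattern holds in the real file, the number of distinct IDs would be the sum of observations minus surplus across all rows of the duplicates report table, not the figure in the first row alone.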