Hi all, I have a very large dataset of 970,000 observations, which was given to me by an organisation.
I tried to merge this dataset with another, which came back with the error
stata does not uniquely identify observations in the master data
I figured this has to do with my ID variable. I checked for missing values in both the master and the merge file, and there are none.
I then checked for duplicates, as I figured this would be the only other reason (although none of my own code has introduced any duplicates).
I tried duplicates report
[Image: output of duplicates report]
I then tried to list the duplicates, but of course there were too many.
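When a full listing is too long, the duplicates can be summarised rather than listed. The following is a minimal sketch on toy data, not the real file; the variable names `id` and `dup` are hypothetical stand-ins.

```stata
* Toy data standing in for the real file; "id" is a hypothetical variable name
clear
input long id
1
2
2
3
3
3
end

* For each observation, record how many extra copies of its id exist
duplicates tag id, generate(dup)

* Summarise how many observations have 0, 1, 2, ... extra copies
tabulate dup

* Show one example observation per duplicated id instead of every row
duplicates examples id
```

On the real data, `tabulate dup` gives the size of the problem at a glance, and `duplicates examples` keeps the listing to one row per duplicated ID.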
I then tried codebook. As you can see, the number of unique values here differs.
[Image: output of codebook]
My question: why does codebook show a different number of unique values from duplicates report, which shows there are 959,798 unique values?
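One possible explanation, offered as an assumption rather than a diagnosis of this dataset: the copies = 1 row of duplicates report counts observations whose ID occurs exactly once, whereas codebook counts distinct values, including IDs that occur more than once. The sketch below uses toy data with hypothetical variable names (`id`, `first`, `n`) to compute both counts side by side.

```stata
* Toy data: id 1 occurs once, id 2 twice, id 3 three times
clear
input long id
1
2
2
3
3
3
end

* The copies = 1 row counts observations whose id appears exactly once
duplicates report id

* Distinct ids: tag one observation per id and count the tags
egen byte first = tag(id)
count if first

* Ids that appear exactly once (singletons)
bysort id: generate long n = _N
count if n == 1
```

In this toy example there are three distinct IDs but only one that occurs exactly once, so the two counts can legitimately differ whenever duplicates are present.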