Dear Statalisters,
I have a dataset with three (quite long) string variables and other (numeric and string) variables. In total I have roughly 100,000 observations. I want to drop duplicates in terms of the three string variables and disregard the other variables in this selection. Thus, I write
duplicates drop var1 var2 var3, force
However, the number of observations in the resulting dataset varies quite a lot between runs. For example, I run the code and get 39,367 observations. Then I immediately rerun the same code and get 39,394 observations.
I know it is hard to tell from far away what's going on. But does anyone have an insight into the problem, or has anyone encountered it before? Does duplicates drop get confused when there are a lot of variables? Or is there another problem?
Thank you very much!
All the best
Leon
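A minimal diagnostic sketch for the problem described above, not a confirmed fix. In principle the number of observations that survive duplicates drop equals the number of distinct combinations of var1 var2 var3, which should not depend on sort order, so a count that changes between runs suggests that the data going into the command differ from run to run. The file name mydata.dta is hypothetical, and var1 var2 var3 stand in for the three string variables.

```stata
* Hypothetical file name; var1-var3 stand in for the three string variables.
use mydata, clear

* Fingerprint the data before dropping anything. If this signature
* differs between two runs, the input is changing, not duplicates drop.
datasignature

* Count the distinct combinations without dropping anything.
* The missing option keeps empty strings as a value of their own,
* which matches how duplicates drop treats them.
egen byte first = tag(var1 var2 var3), missing
count if first
drop first

* Summarize how many surplus copies exist per combination.
duplicates report var1 var2 var3

* Drop the duplicates; the count below should equal "count if first".
duplicates drop var1 var2 var3, force
count
```

If the signature or the distinct count changes from run to run, the cause is probably an earlier step in the do-file, for example a sort on a key with ties followed by an order-dependent operation. Adding the stable option to such a sort makes the order reproducible. Note also that with force, which of the duplicate observations is kept (and therefore the values of the other variables) depends on the current order of the data.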