Dear Statalisters,
I have a dataset with three (quite long) string variables plus other numeric and string variables, roughly 100,000 observations in total. I want to drop duplicates in terms of the three string variables only, disregarding the other variables. Thus, I write duplicates drop var1 var2 var3, force
However, the resulting dataset varies in the number of observations it contains by quite a lot. For example, I run the code and get 39,367 observations. Then I immediately rerun the code and get 39,394 observations.
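To make the setup concrete, here is a minimal sketch of the check I have in mind. The names var1, var2, var3 and the helper variable first are placeholders, not the real variable names. On an unchanged dataset, the number of distinct combinations of the three variables should be the same on every run, so comparing it before the drop may show whether the data already differ at that point.

```stata
* Placeholder names: var1, var2, var3 stand in for the three string variables.

* Tag one observation per distinct combination, without dropping anything.
egen byte first = tag(var1 var2 var3)
count if first

* Cross-check with the summary from duplicates report.
duplicates report var1 var2 var3

* Drop duplicates in terms of these three variables only.
duplicates drop var1 var2 var3, force
count
```

If the count of tagged observations already differs between two runs, the variation would seem to come from an earlier step rather than from duplicates drop itself.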
I know it is hard to tell from afar what's going on. But does anyone have an insight into the problem, or has anyone encountered this before? Does duplicates drop get confused when there are a lot of variables? Or is there another problem?
Thank you very much!
All the best
Leon