Dear Statalisters,
I have a dataset with three (quite long) string variables plus other numeric and string variables, roughly 100,000 observations in total. I want to drop duplicates with respect to the three string variables only, disregarding the other variables. So I run
duplicates drop var1 var2 var3, force
However, the number of observations in the resulting dataset varies noticeably from run to run. For example, I run the code and get 39,367 observations; when I immediately rerun it, I get 39,394.
I know it is hard to tell from far away what's going on. But does anyone have any insight into the problem, or has anyone encountered it before? Does duplicates drop get confused when there are a lot of variables? Or is there another problem?
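To narrow this down, one check that might help is to count the distinct combinations of the three string variables directly, without dropping anything. This is only a sketch: var1, var2 and var3 stand in for the actual variable names, and first is a hypothetical helper variable. The number of distinct combinations is a property of the data, so on identical input it should not change between runs, even though which observation ends up first within a group can.

```stata
* Sketch only: var1-var3 are placeholders for the three string variables.
* Summarize how many observations are surplus copies on these variables.
duplicates report var1 var2 var3

* Flag exactly one observation per distinct combination ("first" is a
* hypothetical helper name), then count the flagged observations.
bysort var1 var2 var3: generate byte first = (_n == 1)
count if first
```

If count if first returns the same number on every run but the dataset after duplicates drop does not, the difference presumably arises somewhere other than in the duplicates drop step itself, for example in an earlier step whose result depends on the sort order of tied observations.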
Thank you very much!
All the best
Leon