Dear Statalisters,
I have a dataset with three (quite long) string variables plus other (numeric and string) variables, roughly 100,000 observations in total. I want to drop duplicates in terms of the three string variables only, disregarding the other variables. Thus, I run:
duplicates drop var1 var2 var3, force
However, the resulting dataset varies in the number of observations it contains by quite a lot. For example, I run the code and get 39,367 observations. Then I immediately rerun the code and get 39,394 observations.
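In case it helps, here is a minimal sketch of the kind of check I mean. The file name mydata.dta is only a placeholder, and var1, var2, var3 stand for the three string variables; this is an illustration, not my full do-file.

```stata
* Placeholder file name; substitute the actual dataset
use mydata, clear

* Tabulate how many surplus copies exist for the three string variables
duplicates report var1 var2 var3

* Keep one observation per distinct combination of the three variables
duplicates drop var1 var2 var3, force

* Number of observations remaining after the drop
count
```

Running this block twice from a freshly loaded file and comparing the two counts should show whether the difference arises in the duplicates step itself or somewhere earlier in the code.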
I know it is hard to tell from afar what's going on, but does anyone have any insight into the problem, or has anyone encountered it before? Does duplicates drop get confused when there are a lot of variables? Or is there another problem?
Thank you very much!
All the best
Leon