Dear Statalisters,

I have a dataset with three (quite long) string variables plus other numeric and string variables, roughly 100,000 observations in total. I want to drop observations that are duplicates in terms of the three string variables, disregarding the other variables in this selection. Thus, I run: duplicates drop var1 var2 var3, force

However, the resulting dataset varies in the number of observations it contains by quite a lot. For example, I run the code and get 39,367 observations. Then I immediately rerun the code and get 39,394 observations.

I know it is hard to tell from afar what's going on, but might anyone have an insight into the problem, or has anyone encountered it before? Does duplicates drop get confused when there are a lot of variables? Or is there another problem?
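For reference, here is a minimal sketch of what I am doing (var1, var2, var3 are placeholders for my actual variable names); the report step should give a stable count of surplus observations before anything is dropped:

```stata
* Count how many observations are surplus copies in terms of the three keys.
* This count should be identical on every run of the same dataset.
duplicates report var1 var2 var3

* Drop the surplus copies; -force- is required because the observations
* may differ on the variables not listed in the varlist.
duplicates drop var1 var2 var3, force

* Check the resulting number of observations.
count
```

If the number reported by duplicates report is already unstable across runs, the problem would lie in the data or an earlier step rather than in duplicates drop itself.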
Thank you very much!

All the best
Leon