BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, transformation and loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using technologies such as relational databases, C#, PHP and Android.

Duplicates Drop Force Yields Different Results for Same Dataset

Dear Statalisters,

I have a dataset with three (quite long) string variables plus other numeric and string variables, roughly 100,000 observations in total. I want to drop duplicates in terms of the three string variables only, disregarding the other variables. Thus, I write duplicates drop var1 var2 var3, force

However, the resulting dataset varies in the number of observations it contains by quite a lot. For example, I run the code and get 39,367 observations. Then I immediately rerun the code and get 39,394 observations.

I know it is hard to tell from a distance what's going on. But does anyone have any insight into the problem, or has anyone encountered it before? Does duplicates drop get confused when there are a lot of variables? Or is there another problem?
Thank you very much!

All the best
Leon
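
One way to cross-check the count described in the question is sketched below. It is only an illustration, not a diagnosis: var1, var2 and var3 are the placeholder names from the post, and the temporary variable first and the local n_distinct are names made up for this example. The sketch counts the distinct combinations of the three variables independently of duplicates drop and compares the two numbers. For a fixed dataset that count does not depend on sort order, so if it also changes from run to run, the data reaching this step may already differ between runs.

```stata
* Sketch only: var1, var2, var3 are the placeholder names used in the post.
preserve

* Report how many surplus copies exist for the three key variables.
duplicates report var1 var2 var3

* Count distinct combinations independently of -duplicates drop-:
* flag the first observation within each var1/var2/var3 group.
bysort var1 var2 var3: generate byte first = (_n == 1)
count if first
local n_distinct = r(N)

* Drop duplicates on the three variables, ignoring all other variables.
duplicates drop var1 var2 var3, force
count

* The number of observations left should equal the number of groups.
assert r(N) == `n_distinct'

restore
```

Running this twice in a row on the same saved file and comparing the reported counts would show whether the variation comes from this step or from something earlier in the do-file.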
