BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, transformation, and loading, analytical datasets, and data analysis. BJ Data Tech Solutions teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP, and Android.

winsor and winsor2, different results

Hello all,

I am a new Stata user and have some questions about winsorizing.

For example, I want to winsorize variable a, which has 20 observations, at the 5th and 95th percentiles: -40 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 1053 (-40 and 1053 are the outliers beyond the 5th and 95th percentiles)

The command winsor a, gen(a_w) p(0.05) gives me: -5 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 101

and the command winsor2 a, suffix(_w2) cuts(5 95) gives me: -22.5 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 577

Based on my understanding, both commands should perform the same task. So why are the results different? Which one is correct?

Another, more general question: if one wants to winsorize a sequence of values such as 1 2 3 4 ... 98 99 100 at the 1st and 99th percentiles, what is the correct result? Should it be 2 2 3 4 ... 98 99 99?

Han
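
One possible reading of the two results above, sketched in Python rather than Stata so the arithmetic can be checked by hand. It rests on two assumptions that are not stated in the post. First, that winsor with p() replaces the h = floor(p × n) most extreme values in each tail with the next value inward. Second, that winsor2 with cuts() clips at percentiles computed by Stata's default rule, which averages two adjacent order statistics when n × p / 100 is a whole number. The function names below are hypothetical and exist only for this illustration. Under these assumptions both outputs are reproduced exactly, which suggests the commands follow different conventions rather than one of them being wrong.

```python
import math

def winsor_by_count(values, p):
    """Assumed winsor rule: replace the h = floor(p * n) lowest values with
    the next value up, and the h highest values with the next value down."""
    xs = sorted(values)
    n = len(xs)
    h = math.floor(p * n + 1e-9)  # small tolerance against floating-point error
    lo, hi = xs[h], xs[n - h - 1]
    return [min(max(x, lo), hi) for x in xs]

def stata_style_percentile(xs_sorted, pct):
    """Assumed default percentile rule (valid for 0 < pct < 100):
    pos = n * pct / 100; if pos is a whole number, average the values at
    1-based positions pos and pos + 1; otherwise take the value at ceil(pos)."""
    n = len(xs_sorted)
    pos = n * pct / 100
    if abs(pos - round(pos)) < 1e-9:
        i = int(round(pos))
        return (xs_sorted[i - 1] + xs_sorted[i]) / 2
    return xs_sorted[math.ceil(pos) - 1]

def winsor_by_percentile(values, low_pct, high_pct):
    """Assumed winsor2 rule: clip every value to the two percentile cut points."""
    xs = sorted(values)
    lo = stata_style_percentile(xs, low_pct)
    hi = stata_style_percentile(xs, high_pct)
    return [min(max(x, lo), hi) for x in xs]

a = [-40, -5, 10, 13, 15, 19, 26, 28, 41, 58,
     78, 85, 86, 89, 89, 91, 92, 101, 101, 1053]

by_count = winsor_by_count(a, 0.05)
by_pct = winsor_by_percentile(a, 5, 95)
print(by_count[0], by_count[-1])  # -5 101
print(by_pct[0], by_pct[-1])      # -22.5 577.0

seq = list(range(1, 101))
seq_count = winsor_by_count(seq, 0.01)
seq_pct = winsor_by_percentile(seq, 1, 99)
print(seq_count[:2], seq_count[-2:])  # [2, 2] [99, 99]
print(seq_pct[:2], seq_pct[-2:])      # [1.5, 2] [99, 99.5]
```

With 20 observations, 5% of n is exactly 1, so the count-based rule pulls -40 up to -5 and 1053 down to 101, while the percentile-based rule uses the midpoints (-40 + -5) / 2 = -22.5 and (101 + 1053) / 2 = 577. The same logic applied to 1, 2, ..., 100 gives 2 2 3 ... 98 99 99 under the first rule and 1.5 2 3 ... 98 99 99.5 under the second.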
