Hello all,
I am a new Stata user and have some questions about winsorizing.
For example, I want to winsorize variable a, which has 20 observations, at the 5th and 95th percentiles: -40 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 1053 (-40 and 1053 are the outliers at these cutoffs)
The command winsor a, gen(a_w) p(0.05) gives me: -5 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 101
and the command winsor2 a, suffix(_w2) cuts(5 95) gives me: -22.5 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 577
Based on my understanding, both commands should perform the same task. So why are the results different? Which one is correct?
Another, more general question: if one wants to winsorize a sequence of data such as 1 2 3 4 ... 98 99 100 at the 1st and 99th percentiles, what is the correct result? Should it be 2 2 3 4 ... 98 99 99?
Han
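
Editorial note: both sets of numbers quoted in the post can be reproduced by hand, which suggests the two commands apply different cutoff rules rather than one of them being wrong. The sketch below is plain Python, not the source code of either package. It assumes that winsor with p() pulls in the h = floor(p * n) most extreme values in each tail to the nearest remaining value, and that winsor2 with cuts() clamps to percentiles computed the way Stata's summarize, detail does (averaging two neighbouring order statistics when n * p / 100 is a whole number). The function names are hypothetical and exist only for this illustration.

```python
import math

def winsorize_by_count(values, p):
    """Assumed winsor-style rule: replace the h = floor(p * n) smallest values
    with the (h+1)-th smallest, and the h largest with the (h+1)-th largest."""
    xs = sorted(values)
    n = len(xs)
    h = math.floor(p * n + 1e-9)  # small tolerance against floating-point error
    if h == 0:
        return xs
    lo, hi = xs[h], xs[n - h - 1]
    return [min(max(x, lo), hi) for x in xs]

def percentile_averaging(xs_sorted, p):
    """Assumed percentile rule for 0 < p < 100: with pos = n * p / 100,
    average the pos-th and (pos+1)-th order statistics when pos is a whole
    number, otherwise take the order statistic just above pos."""
    n = len(xs_sorted)
    pos = n * p / 100
    k = round(pos)
    if abs(pos - k) < 1e-9 and 1 <= k < n:
        return (xs_sorted[k - 1] + xs_sorted[k]) / 2
    return xs_sorted[min(math.floor(pos), n - 1)]

def winsorize_by_percentile(values, low, high):
    """Assumed winsor2-style rule: clamp every value to the two percentiles."""
    xs = sorted(values)
    lo = percentile_averaging(xs, low)
    hi = percentile_averaging(xs, high)
    return [min(max(x, lo), hi) for x in xs]

# The 20 observations from the post.
a = [-40, -5, 10, 13, 15, 19, 26, 28, 41, 58,
     78, 85, 86, 89, 89, 91, 92, 101, 101, 1053]

by_count = winsorize_by_count(a, 0.05)
by_pct = winsorize_by_percentile(a, 5, 95)
print(by_count[0], by_count[-1])  # -5 101
print(by_pct[0], by_pct[-1])      # -22.5 577.0

# The second question: 1, 2, ..., 100 at the 1st and 99th percentiles.
seq = list(range(1, 101))
seq_count = winsorize_by_count(seq, 0.01)
seq_pct = winsorize_by_percentile(seq, 1, 99)
print(seq_count[:2], seq_count[-2:])  # [2, 2] [99, 99]
print(seq_pct[:2], seq_pct[-2:])      # [1.5, 2] [99, 99.5]
```

Under these assumptions, with 20 observations the 5th percentile falls exactly between the first and second values, so the percentile rule yields (-40 + -5) / 2 = -22.5 and (101 + 1053) / 2 = 577, while the counting rule yields -5 and 101. For 1 to 100, the counting rule gives the 2 2 3 ... 98 99 99 expected in the post, and the percentile rule gives 1.5 2 3 ... 98 99 99.5. Neither result is incorrect; the choice depends on which definition of the cutoff is wanted.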