Hello all,
I am a new Stata user and have some questions about winsorizing.
For example, I want to winsorize variable a, which has 20 observations, at the 5th and 95th percentiles: -40 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 1053 (-40 and 1053 are the outliers given the 5th and 95th percentiles)
The command winsor a, gen(a_w) p(0.05) gives me: -5 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 101
and the command winsor2 a, suffix(_w2) cuts(5 95) gives me: -22.5 -5 10 13 15 19 26 28 41 58 78 85 86 89 89 91 92 101 101 577
Based on my understanding, both commands should perform the same task. So why are the results different? Which one is correct?
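One way to see where the two sets of numbers could come from is to sketch both rules outside Stata. The Python below is only an illustration, not the source of either command. It assumes that winsor replaces the h = floor(p * n) lowest and highest values with the next value inward, and that winsor2 clamps the data to percentiles computed with Stata's default definition, which averages two adjacent sorted values when n * pct / 100 is a whole number. The function names are made up for this sketch.

```python
import math

a = [-40, -5, 10, 13, 15, 19, 26, 28, 41, 58,
     78, 85, 86, 89, 89, 91, 92, 101, 101, 1053]

def winsor_by_count(values, p):
    """Assumed winsor rule: replace the h lowest and h highest values
    with the nearest remaining value, where h = floor(p * n)."""
    xs = sorted(values)
    n = len(xs)
    h = math.floor(p * n + 1e-9)  # small tolerance against float rounding
    lo, hi = xs[h], xs[n - h - 1]
    return [min(max(x, lo), hi) for x in xs]

def stata_percentile(xs_sorted, pct):
    """Assumed default percentile rule: with P = n * pct / 100, average
    x[P] and x[P+1] when P is a whole number, else take x[floor(P) + 1]
    (1-based indexing)."""
    n = len(xs_sorted)
    pos = n * pct / 100
    k = math.floor(pos + 1e-9)
    if abs(pos - k) < 1e-9:          # P is a whole number
        if k == 0:
            return xs_sorted[0]
        if k >= n:
            return xs_sorted[-1]
        return (xs_sorted[k - 1] + xs_sorted[k]) / 2
    return xs_sorted[k]              # 0-based index k is 1-based floor(P) + 1

def winsor_by_percentile(values, low_pct, high_pct):
    """Assumed winsor2 rule: clamp every value to [p_low, p_high]."""
    xs = sorted(values)
    lo = stata_percentile(xs, low_pct)
    hi = stata_percentile(xs, high_pct)
    return [min(max(x, lo), hi) for x in xs]

print(winsor_by_count(a, 0.05))         # first value -5, last value 101
print(winsor_by_percentile(a, 5, 95))   # first value -22.5, last value 577.0
```

Under these assumptions the sketch reproduces both outputs quoted above, which suggests the two commands differ in how they define the cutoff rather than one of them being wrong: -22.5 is the average of -40 and -5, and 577 is the average of 101 and 1053.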
Another, more general question: if one wants to winsorize a sequence of values such as 1 2 3 4 ... 98 99 100 at the 1st and 99th percentiles, what is the correct result? Should it be 2 2 3 4 ... 98 99 99?
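For the 1 to 100 case, the answer appears to depend on how the cutoffs are defined. The short Python sketch below computes the lower and upper cutoffs under two assumed rules: a count-based rule (drop h = floor(p * n) values from each tail and use the next value inward) and a percentile rule that averages adjacent sorted values when n * pct / 100 is a whole number. Both helper names are hypothetical and exist only for this illustration.

```python
import math

xs = list(range(1, 101))  # 1, 2, ..., 100 (already sorted)
n = len(xs)

def count_cutoffs(xs_sorted, p):
    """Cutoffs when the h = floor(p * n) most extreme values in each
    tail are replaced by the next value inward."""
    size = len(xs_sorted)
    h = math.floor(p * size + 1e-9)  # tolerance against float rounding
    return xs_sorted[h], xs_sorted[size - h - 1]

def averaged_percentile(xs_sorted, pct):
    """Percentile that averages x[P] and x[P+1] (1-based) when
    P = n * pct / 100 is a whole number between 1 and n - 1."""
    size = len(xs_sorted)
    pos = size * pct / 100
    k = math.floor(pos + 1e-9)
    if abs(pos - k) < 1e-9 and 0 < k < size:
        return (xs_sorted[k - 1] + xs_sorted[k]) / 2
    return xs_sorted[min(k, size - 1)]

print(count_cutoffs(xs, 0.01))                                   # (2, 99)
print(averaged_percentile(xs, 1), averaged_percentile(xs, 99))   # 1.5 99.5
```

So the count-based rule would give 2 2 3 ... 98 99 99, while the averaging percentile rule would give 1.5 2 3 ... 98 99 99.5. Neither is the single correct result; each follows from its own definition of the cutoff.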
Han