Dear Statalists,
I hope you are well. I would like to ask about using the winsor2 command to deal with outliers in my dataset. I tried the following steps on a number of variables, but the variables did not change, as the examples below show.
Example (1)
clonevar PO_ST_W = PO_GEN
su PO_GEN_W , d
winsor2 P_GEN_W , replace cuts(1 99)
replace P_GEN_W =r(p99) if PO_GEN_W >=r(p99) & PO_GEN_W <.
replace P_GEN_W =r(p1) if PO_GEN_W >=r(p1) & PO_GEN_W <.
. replace PO_GEN_W =r(p1) if PO_GEN_W >=r(p1) & PO_GEN_W >.
(0 real changes made)
. replace PO_GEN_W =r(p99) if PO_GEN_W >=r(p99) & PO_GEN_W <.
(0 real changes made)
Example (2)
clonevar PO_ST_W = PO_GEN
su R_ST_W , d
winsor2 R_ST_W , replace cuts(1 99)
replace R_ST_W =r(p99) if R_ST_W >=r(p99) & R_ST_W <.
replace R_ST_W =r(p1) if R_ST_W >=r(p1) & R_ST_W <.
. replace R_ST_W =r(p1) if R_ST_W >=r(p1) & R_ST_W >.
(0 real changes made)
. replace R_ST_W =r(p99) if R_ST_W >=r(p99) & R_ST_W <.
(0 real changes made)
su R_ST_W, d
                    Level of satisfaction
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%           .5              0       Obs                 300
25%          1.5              0       Sum of Wgt.         300

50%            2                      Mean               1.65
                        Largest       Std. Dev.      .6549273
75%            2              2
90%            2              2       Variance       .4289298
95%            2              2       Skewness       -1.63945
99%            2              2       Kurtosis       4.263773
I have attached a sample box plot (graph box) that shows an outlier in one of the variables.
probit Sksupprt i.FST_EXP i.FST_B i.FST_GW i.FST_AD i.FST_ADV i.R_LN i.R_ST_W i.PO_GEN i.PO_CIT i.PO_EP i.PO_EC i.FA_SE i.FA_AE i.FA_SI
My variables are dummy and categorical variables: the former are coded 0/1 and the latter start at 0 (0, 1, 2, ...), with 300 observations.
Could you please help me apply winsor2 to the variables that have outliers? And why am I getting "0 real changes made" as a result?
Many thanks for your continued help.
Kind Regards,
Rabab
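
One way to look into the "0 real changes made" result is sketched below, under stated assumptions. The name R_ST is hypothetical and stands in for the original, unwinsorized variable, which the post does not name. The sketch checks whether any values actually lie outside the 1st and 99th percentiles, then winsorizes by hand using percentiles saved in local macros right after summarize. Two points seem relevant to the posted commands: replace only counts a change when the stored value differs, so assigning r(p99) to values that already equal r(p99) reports zero changes, and a condition such as `R_ST_W > .` matches only extended missing values rather than ordinary data.

```stata
* Sketch only. R_ST is a hypothetical name for the original, unwinsorized variable.
* winsor2 itself is user-written; if it is needed: ssc install winsor2
clonevar R_ST_W = R_ST

* Save the percentiles in locals immediately, because later r-class
* commands (count, for instance) overwrite the r() results.
summarize R_ST, detail
local p1  = r(p1)
local p99 = r(p99)
display "min = " r(min) "  p1 = `p1'  p99 = `p99'  max = " r(max)

* How many observations fall outside the cut points?
count if R_ST < `p1'
count if R_ST > `p99' & !missing(R_ST)

* Manual winsorizing: pull the lower tail up and the upper tail down.
replace R_ST_W = `p1'  if R_ST_W < `p1'
replace R_ST_W = `p99' if R_ST_W > `p99' & !missing(R_ST_W)
```

If both counts are zero, which is likely for a variable coded 0, 1, 2 whose 1st percentile equals its minimum and whose 99th percentile equals its maximum, there is nothing for winsorizing at cuts(1 99) to change.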