Hello everyone,
I am currently working with panel data (firm, year) for my seminar paper and am preparing the data for the analysis. My problem is as follows:
I generated the variable CETR (Cash Effective Tax Rate) with the command
gen CETR = CF_TAXATION / PRETAX_INCOME
The results included some negative values, some values larger than 1 and also missing values.
Now, in an effort to control for outliers, I wanted to winsorize the values of CETR to the interval from 0 to 1, i.e. if CETR>1 it should be set to 1 and if CETR<0 it should be set to 0.
replace CETR=0 if CETR<0
replace CETR=1 if CETR>1
After looking at the results, I noticed that Stata had assigned the value 1 to observations where CETR was originally missing. This happens because Stata treats missing values as larger than any number (effectively positive infinity), so the condition CETR>1 is true for them. Since a substantial share of my data is missing, this biases my results considerably. My question is therefore: how do I have to alter the previous commands to prevent this, i.e. how do I tell Stata to keep missing values missing in such a setting?
Thanks in advance and kind regards,
Lucas
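
One possible fix, sketched here on made-up toy data rather than the actual panel: exclude missing values explicitly in the condition of the second replace, using Stata's missing() function. The five input rows below are invented purely for illustration; only the variable names CF_TAXATION, PRETAX_INCOME and CETR come from the post above.

```stata
* Toy data standing in for the real panel (values are made up for illustration)
clear
input double(CF_TAXATION PRETAX_INCOME)
 30  100
-10  100
150  100
  .  100
 20    0
end

* Same definition as in the post; division by zero and missing inputs give missing
gen CETR = CF_TAXATION / PRETAX_INCOME

* Negative rates are set to 0; missing is never < 0, so this line is already safe
replace CETR = 0 if CETR < 0

* Missing (.) ranks above every number, so "CETR > 1" alone is true for missing.
* Adding !missing(CETR) leaves those observations untouched.
replace CETR = 1 if CETR > 1 & !missing(CETR)

list
```

Writing the condition as CETR > 1 & CETR < . should behave the same way, since all missing codes rank above every nonmissing number. A one-line variant such as replace CETR = min(max(CETR, 0), 1) if !missing(CETR) also needs the if qualifier, because max() and min() ignore missing arguments and would otherwise turn missing values into 0.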