I have lots of zeros in both my dependent and independent variables.
One way I was dealing with this was to add 1 to all of the values. However, the variables are still right-skewed after this shift, so I took the natural log to get closer to a normal distribution. But when I do this, I get a spike on the left followed by an approximately normal distribution (see an example below). As I believe this violates the normality assumption, I tried dropping the zeros, but that reduces the sample size too much and I no longer get significance in my models. I read that I could impute the zero values with the mean, but I know that would misrepresent my data. I also read that I could take the square root instead of the log as the transformation, but the data are still right-skewed rather than normally distributed. Any other thoughts on how I might deal with this issue would be much appreciated!
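To make the spike concrete, here is a minimal sketch in Python using simulated data rather than my actual variables. The 40% share of zeros, the log-normal shape of the positive values, and the helper name `share_at_minimum` are all assumptions chosen for illustration. It shows that log(x + 1) and the square root both send every zero to exactly 0, so the block of tied values survives either transformation as a spike at the left edge.

```python
import math
import random

random.seed(0)

# Hypothetical data: about 40% exact zeros, the rest positive and right-skewed.
n = 1000
values = [
    0.0 if random.random() < 0.4 else random.lognormvariate(3.0, 1.0)
    for _ in range(n)
]

# log(x + 1): every zero maps to log(1) = 0, so the tied zeros stay tied.
logged = [math.log1p(v) for v in values]

# Square root: zeros also map to exactly 0.
rooted = [math.sqrt(v) for v in values]


def share_at_minimum(xs):
    """Fraction of observations sitting exactly at the smallest value."""
    lowest = min(xs)
    return sum(1 for x in xs if x == lowest) / len(xs)


zero_share = sum(1 for v in values if v == 0.0) / n

# The three shares are identical: neither transformation spreads the zeros out.
print(zero_share, share_at_minimum(logged), share_at_minimum(rooted))
```

Because both transformations are strictly increasing, they can reshape the positive part of the distribution but cannot break up the ties at zero, which seems to be what I am seeing in my histogram.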