I have lots of zeros in both my dependent and independent variables.
One way I have been dealing with this is by adding 1 to all of the values. The variables are right-skewed, so I then took the natural log to bring them closer to a normal distribution. But when I do this, I get a spike on the left followed by a roughly normal distribution (see an example below). As I believe this violates the assumption of normality, I tried dropping the zeros, but that reduces the sample size too much and I no longer get significance in my models. I read that I could impute the zero values with the mean, but I know that would misrepresent my data. I also read that I could take the square root instead of the log as the transformation, but the data are still right-skewed rather than normally distributed. Any other thoughts on how I might deal with this issue would be much appreciated!
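To make the spike concrete, here is a minimal Python sketch on simulated data. It is not my actual data: the 40% share of zeros, the lognormal shape of the positive values, and the helper names `log1p_transform` and `share_at_zero` are all assumptions made purely for illustration. It shows that log(x + 1) maps every zero to exactly 0, so the zeros remain a point mass at the left edge however the positive values are reshaped.

```python
import math
import random


def log1p_transform(values):
    """Return log(1 + x) for each non-negative value."""
    return [math.log1p(v) for v in values]


def share_at_zero(values):
    """Fraction of observations that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


random.seed(0)

# Hypothetical data: about 40% exact zeros, the rest right-skewed (lognormal).
raw = [
    0.0 if random.random() < 0.4 else random.lognormvariate(1.0, 0.8)
    for _ in range(1000)
]

transformed = log1p_transform(raw)

# The share of zeros is identical before and after the transformation,
# because log(0 + 1) = 0: the spike is moved nowhere, only the tail is compressed.
print("share of zeros, raw:        ", share_at_zero(raw))
print("share of zeros, transformed:", share_at_zero(transformed))
```

The same reasoning applies to the square root, since sqrt(0) = 0 as well: a transformation that maps each value to a single new value cannot spread out observations that are all equal.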