I have lots of zeros in both my dependent and independent variables.
One way I have been dealing with this is by adding 1 to all of the values. The variables are still right-skewed after that, so I took the natural log to get closer to a normal distribution. But when I do this I get a spike on the left followed by a roughly normal distribution (see an example below). As I believe this violates the assumption of normality, I tried dropping the zeros, but that reduces the sample size too much and I no longer get significance in my models. I read that I could impute the zero values with the mean, but I know that would misrepresent my data. I also read that I could take the square root instead of the log, but the data are still right-skewed rather than normally distributed. Any other thoughts on how I might deal with this issue would be much appreciated!
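For anyone trying to reproduce the spike, here is a minimal sketch in Python (not the software I am actually using). The data are simulated and purely hypothetical: the 30% share of zeros and the log-normal parameters are assumptions chosen for illustration, not my real variables. It only shows that ln(x + 1) and the square root both send every zero to exactly 0, so the zeros stay piled up at the left edge whichever of the two transformations is used.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: roughly 30% exact zeros, the rest right-skewed positives.
values = [
    0.0 if random.random() < 0.3 else random.lognormvariate(3.0, 1.0)
    for _ in range(1000)
]

# The two transformations described above.
log_values = [math.log(v + 1.0) for v in values]  # ln(x + 1)
sqrt_values = [math.sqrt(v) for v in values]      # square root

# Count how many observations sit exactly at zero before and after.
n_zero = sum(v == 0.0 for v in values)
n_zero_log = sum(v == 0.0 for v in log_values)
n_zero_sqrt = sum(v == 0.0 for v in sqrt_values)

print("zeros in raw data:      ", n_zero)
print("zeros after ln(x + 1):  ", n_zero_log)
print("zeros after square root:", n_zero_sqrt)
```

The three counts come out equal, which is the spike: the transformation reshapes the positive values but leaves the mass at zero where it was.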