Dear All,

I am working on research to identify the factors that influence growth ambitions. My dependent variable, growth aspirations, is continuous and measures the number of jobs the entrepreneur expects to create in five years. It is strongly right-skewed (skewness 12.3), so I would like to take its natural logarithm to bring it closer to a normal distribution. However, most of the values are 0, and these become missing once I apply the transformation. Previous authors calculate entrepreneurs' growth aspirations as the difference between (the natural logarithms of) the entrepreneur's expected number of employees in the next 5 years and the actual number of employees, exclusive of owners, at the firm's inception. But if I take log(Growthaspirations - actual number of employees), I also get missing data, because the difference can be negative. I have 90 thousand observations, and 60-70% of them would be lost if I applied the log transformation. I am using the same database as those authors, and their papers were published in highly recognized journals, so I assume they handled this correctly, but I cannot work out how.
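To make the problem concrete, here is a minimal Python sketch with made-up numbers (the function names and the example rows are hypothetical, not taken from the database). It compares my attempt, the log of the difference, with a literal reading of the authors' description, the difference of the logs. It also shows a log(1 + x) variant, which is only my guess at a possible workaround and not something I have confirmed the authors used.

```python
import math

def safe_log(x):
    """Natural log, or None (treated as missing) when x <= 0."""
    return math.log(x) if x > 0 else None

def log_of_difference(expected, current):
    # log(expected - current): missing whenever expected <= current
    return safe_log(expected - current)

def difference_of_logs(expected, current):
    # log(expected) - log(current): missing whenever either value is 0
    log_e, log_c = safe_log(expected), safe_log(current)
    if log_e is None or log_c is None:
        return None
    return log_e - log_c

def difference_of_log1p(expected, current):
    # ASSUMPTION: log(1 + x) variant; defined for zeros, but not
    # confirmed as the method used in the published papers.
    return math.log1p(expected) - math.log1p(current)

# Hypothetical (expected jobs in 5 years, current jobs) pairs
rows = [(0, 0), (5, 0), (3, 10), (20, 4)]
for expected, current in rows:
    print(
        expected,
        current,
        log_of_difference(expected, current),
        difference_of_logs(expected, current),
        difference_of_log1p(expected, current),
    )
```

In this sketch only the last row survives the first two transformations, while the log(1 + x) variant is defined for every row. Whether that variant is appropriate for my data is exactly what I am unsure about.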

Can anyone help me to solve this issue?

Thank you so much!