Dear members,

I have a binary dependent variable and want to estimate either a logistic regression or a linear probability model (LPM). My key explanatory variable (a measure of exposure to specific media content) has many zeros, some medium values and a few extremely high values. As an example, see these summary statistics for the explanatory variable:

Code:
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs              21,169
25%            0              0       Sum of Wgt.      21,169

50%            0                      Mean           .3175061
                        Largest       Std. Dev.      1.190602
75%            0             21
90%     .8571429             23       Variance       1.417533
95%            2       25.28572       Skewness       8.337951
99%     5.428571       31.57143       Kurtosis       110.2057

As you can see, the maximum value is more than 20 times the standard deviation of the variable. Because there are many zeros, I cannot log-transform the variable. Do you have any ideas about the extent to which this could be a problem? Should I apply a transformation to the variable? Does it affect the choice between the LPM and the logistic model? For example, is logistic regression more robust to skewed distributions of an explanatory variable?
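
To make the question concrete, here is a minimal sketch of the alternatives I am considering. It is only an illustration: the data are simulated stand-ins, not my real data, and the names `outcome`, `exposure`, `exposure_ihs`, `exposure_log1p` and `any_exposure` are placeholders I made up for this post. The two transformations shown (inverse hyperbolic sine and log of one plus the variable) are simply options that are defined at zero. I am not claiming either is the right choice.

```stata
* Simulated stand-in data (NOT the real data); all variable names are placeholders
clear
set seed 12345
set obs 1000
* Roughly 85% zeros, the rest right-skewed positive values
generate exposure = cond(runiform() < 0.85, 0, exp(rnormal(0, 1)))
generate byte outcome = runiform() < invlogit(-1 + 0.3*asinh(exposure))

* Transformations that are defined at zero
* Inverse hyperbolic sine
generate exposure_ihs = asinh(exposure)
* Log of one plus the variable
generate exposure_log1p = ln(1 + exposure)
* Indicator for any exposure at all
generate byte any_exposure = exposure > 0 if !missing(exposure)

* Same specification estimated as an LPM and as a logit
regress outcome any_exposure exposure_ihs, vce(robust)
logit outcome any_exposure exposure_ihs, vce(robust)
* Average marginal effect from the logit, to compare with the LPM coefficient
margins, dydx(exposure_ihs)
```

Replacing `exposure_ihs` with `exposure_log1p` or with the untransformed `exposure` would give the other variants I could compare.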