[Note: PhD student new to Stata and still somewhat of a beginner with stats analysis]
In my dataset, the variable 'Inputs' holds monetary values, some of which are 0. I log-transformed 'Inputs' for use in regressions, but because the log of 0 is undefined, Stata sets the logged value to missing and drops the roughly 25 observations where 'Inputs' = 0 from the estimation. I would prefer not to lose those observations because my sample is only n = 147.
On my supervisor's advice, I replaced 'Inputs' = 0 with 'Inputs' = 1 for those observations and then logged the variable again. They now stay in the sample with 'Log_Inputs' = 0 instead of being dropped. However, the R-squared is lower with them included, which makes the model look weaker.
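For reference, here is a minimal sketch of the two approaches I am comparing. 'Outcome' is only a placeholder for my dependent variable, and 'Inputs_adj' and 'Log_Inputs_adj' are names made up for this example; my actual model has more regressors.

```stata
* How many observations have Inputs equal to 0?
count if Inputs == 0

* Approach 1: log the raw values.
* ln(0) is missing in Stata, so regress leaves those rows out.
generate Log_Inputs = ln(Inputs)
regress Outcome Log_Inputs

* Approach 2: recode 0 to 1 in a copy of the variable, then log.
* ln(1) = 0, so the recoded rows stay in the sample with a logged value of 0.
generate Inputs_adj = Inputs
replace Inputs_adj = 1 if Inputs == 0
generate Log_Inputs_adj = ln(Inputs_adj)
regress Outcome Log_Inputs_adj
```

Comparing the number of observations reported by the two regress commands shows how many rows each approach keeps.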
Which is the better choice: drop the observations that cannot be logged, or keep the full sample and accept a weaker model?