Hello,
I was wondering when it is best to clean the data (i.e. delete missing or negative values) when performing different types of regression analysis?
If I want to first run a simple linear regression and afterwards a multiple regression, should I drop every observation with a missing or negative value in any of the variables I want to use (i.e. keep only VAR>=0) at the beginning, before running both regressions, or should I drop only the observations with missing or negative values in the variables used in that particular regression?
I would think that the first option is better, since both regressions would then be based on the same number of observations.
Otherwise, the simple regression could be based on, for instance, 20,000 observations and the multiple regression on only 14,000.
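To make the two options concrete, here is a minimal sketch in plain Python. It is only an illustration, not my actual data: the variable names y, x1 and x2, the helper usable(), and the six toy rows are all made up, and I am assuming the multiple regression uses every variable of the simple one plus x2.

```python
# Toy data; None marks a missing value. All names and values are hypothetical.
rows = [
    {"y": 1.0, "x1": 2.0, "x2": 3.0},
    {"y": 2.0, "x1": 1.5, "x2": None},   # x2 missing
    {"y": 3.0, "x1": -1.0, "x2": 4.0},   # x1 negative
    {"y": 4.0, "x1": 3.0, "x2": -2.0},   # x2 negative
    {"y": None, "x1": 2.5, "x2": 1.0},   # y missing
    {"y": 5.0, "x1": 4.0, "x2": 2.0},
]

def usable(row, variables):
    """Return True if every listed variable is present and non-negative."""
    return all(row[v] is not None and row[v] >= 0 for v in variables)

simple_vars = ["y", "x1"]          # variables in the simple regression
multiple_vars = ["y", "x1", "x2"]  # variables in the multiple regression

# Option 1: clean once, on all variables used in either regression,
# so both regressions share one common sample.
common_sample = [r for r in rows if usable(r, multiple_vars)]
n_simple_opt1 = len(common_sample)
n_multiple_opt1 = len(common_sample)

# Option 2: clean separately, only on the variables of each regression.
n_simple_opt2 = sum(usable(r, simple_vars) for r in rows)
n_multiple_opt2 = sum(usable(r, multiple_vars) for r in rows)

print("Option 1:", n_simple_opt1, n_multiple_opt1)
print("Option 2:", n_simple_opt2, n_multiple_opt2)
```

With these toy rows, option 1 leaves the same observations in both regressions, while option 2 leaves more observations in the simple regression than in the multiple one, which is exactly the mismatch I am asking about.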
Can someone confirm this?
Thanks in advance!