Hello,
I was wondering at what point it is best to clean the data (i.e. drop observations with missing or negative values) when running several types of regression analysis on the same dataset.
Suppose I want to run a simple linear regression first and a multiple regression afterwards. Should I drop every observation with a missing or negative value (i.e. keep only VAR>=0) in any of the variables I plan to use, once, before running both regressions? Or should I only drop the observations that are unusable for the particular regression I am running?
I would think that the first option is better, since both regressions would then be based on the same number of observations.
Otherwise, the linear regression could be based on, for instance, 20,000 observations and the multiple regression on only 14,000.
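To make the two options concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the toy data, the variable names (y, x1, x2) and the helper usable() are assumptions, not my real dataset, and it only counts observations rather than fitting anything.

```python
# Toy data: None marks a missing value. All names and numbers are hypothetical.
rows = [
    {"y": 1.0, "x1": 2.0, "x2": 3.0},
    {"y": 2.0, "x1": 1.5, "x2": None},   # x2 missing
    {"y": 3.0, "x1": 4.0, "x2": -1.0},   # x2 negative
    {"y": 4.0, "x1": None, "x2": 2.0},   # x1 missing
    {"y": 5.0, "x1": 3.0, "x2": 0.5},
]


def usable(row, variables):
    """Return True if every listed variable is present and non-negative."""
    return all(row[v] is not None and row[v] >= 0 for v in variables)


simple_vars = ["y", "x1"]          # variables in the simple regression
multiple_vars = ["y", "x1", "x2"]  # variables in the multiple regression

# Option 1: clean once, using every variable that appears in either model.
common_sample = [r for r in rows if usable(r, multiple_vars)]
n_simple_common = len(common_sample)
n_multiple_common = len(common_sample)

# Option 2: clean separately, using only the variables of each model.
n_simple_own = len([r for r in rows if usable(r, simple_vars)])
n_multiple_own = len([r for r in rows if usable(r, multiple_vars)])

print("Option 1 (common sample):", n_simple_common, n_multiple_common)
print("Option 2 (per regression):", n_simple_own, n_multiple_own)
```

With option 1 both models see the same rows; with option 2 the simple regression keeps rows that the multiple regression has to drop, which is the 20,000 versus 14,000 situation described above.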
Can someone confirm this?
Thanks in advance!