Hello,
I was wondering at what point it is best to clean the data (i.e. drop missing or negative values) when running several types of regression analysis.
Suppose I first want to run a simple linear regression and afterwards a multiple regression. Should I drop the missing and negative values of all the variables I plan to use (i.e. keep only VAR>=0) at the beginning, before running both regressions, or should I only drop them for the variables used in each particular regression?
I would think that the first option is better, since both regressions would then be based on the same number of observations.
Otherwise, the simple regression could be based on, for instance, 20,000 observations and the multiple regression on only 14,000.
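To make the two options concrete, here is a minimal sketch in Python. The data and the variable names `y`, `x1` and `x2` are made up for illustration, with `None` standing in for a missing value; it only counts which observations each option would keep and does not run any regression.

```python
# Hypothetical toy data: None marks a missing value.
rows = [
    {"y": 10.0, "x1": 1.0, "x2": 3.0},
    {"y": 12.0, "x1": 2.0, "x2": None},   # x2 missing
    {"y": 15.0, "x1": 3.0, "x2": -1.0},   # x2 negative
    {"y": None, "x1": 4.0, "x2": 2.0},    # y missing
    {"y": 20.0, "x1": 5.0, "x2": 6.0},
    {"y": 22.0, "x1": -2.0, "x2": 7.0},   # x1 negative
]


def is_usable(row, variables):
    """True if every listed variable is present and non-negative."""
    return all(row[v] is not None and row[v] >= 0 for v in variables)


simple_vars = ["y", "x1"]          # simple regression: y on x1
multiple_vars = ["y", "x1", "x2"]  # multiple regression: y on x1 and x2

# Cleaning per regression: each model keeps whatever rows are usable
# for its own variables, so the sample sizes can differ.
simple_sample = [r for r in rows if is_usable(r, simple_vars)]
multiple_sample = [r for r in rows if is_usable(r, multiple_vars)]

# Cleaning once up front: keep only rows usable for every variable
# that appears in either model, so both models share one sample.
all_vars = sorted(set(simple_vars) | set(multiple_vars))
common_sample = [r for r in rows if is_usable(r, all_vars)]

print("per-regression sizes:", len(simple_sample), len(multiple_sample))
print("common sample size:  ", len(common_sample))
```

With this toy data, cleaning per regression leaves 4 observations for the simple model and 2 for the multiple model, while cleaning once up front leaves the same 2 observations for both.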
Can someone confirm this?
Thanks in advance!