Hello,

I want to run a simple linear regression (one independent variable) and a multiple regression on a sample of 30,000 observations.
Suppose I use the full sample for the simple regression, but before running the multiple regression I keep only the observations with non-negative values of the additional independent variables (using 'keep if var1>=0 & var2>=0', etc.). This reduces the sample to 16,000 observations. Can I still discuss both regressions in an unbiased way, or do I also need to run the simple regression on the same smaller sample of 16,000 observations?
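
To make the setup concrete, here is a minimal Stata sketch of the two options I am weighing. The variable names y, x1, var1 and var2 are placeholders rather than my real variables, and I use an 'if' condition instead of 'keep' only so that the full sample stays in memory.

```stata
* Placeholder names: y (outcome), x1 (main regressor),
* var1 and var2 (additional regressors)

* Simple regression on the full sample (about 30,000 obs)
regress y x1

* Multiple regression restricted to non-negative values of the
* additional regressors. The !missing() check is there because
* Stata treats numeric missing values as larger than any number,
* so "var1 >= 0" alone would also be true for missing values.
regress y x1 var1 var2 if var1 >= 0 & var2 >= 0 & !missing(var1, var2)

* Flag the observations actually used in the multiple regression
generate byte insample = e(sample)

* Simple regression again, now on the same restricted sample
regress y x1 if insample
```

The last command is the alternative I am asking about: the simple regression estimated on exactly the observations used in the multiple regression.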

P.S. I have a good reason for removing the negative values: they are irrelevant to my research.

Thanks!