Normally, I exclude a covariate from the regression equation if it has many missing observations, say one-fifth fewer observations than the other variables have in general (for example, 80,000 compared to 100,000).

But now, reflecting on this, I wonder whether there is any reference or explanation that justifies excluding a covariate like that. I think it may relate to the within-sample standard deviation and the explanatory power as the sample shrinks, but I am not sure about that.
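
To make the trade-off I have in mind concrete, here is a minimal simulated sketch, not my real data. Everything in it is an assumption made for illustration: the variable names (`x1`, `x2`, `y`), the coefficients, the helper `ols`, and in particular the idea that the covariate is missing completely at random for about 20% of rows. It compares dropping the covariate and keeping all 100,000 rows against keeping the covariate and using only the roughly 80,000 complete rows.

```python
import numpy as np

def ols(X, y):
    """Hypothetical helper: OLS coefficients and classical standard errors."""
    X = np.column_stack([np.ones(len(y)), X])        # add an intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                   # covariate correlated with x1
y = 1.0 + 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Assumption: x2 is missing completely at random in about one-fifth of rows.
observed = rng.random(n) >= 0.2
n_complete = int(observed.sum())

# Option A: exclude x2 and keep the full sample.
beta_a, se_a = ols(x1, y)

# Option B: keep x2 and use complete cases only (the smaller sample).
beta_b, se_b = ols(np.column_stack([x1[observed], x2[observed]]), y[observed])

print("rows used, option A:", n, " option B:", n_complete)
print("coefficient on x1, option A:", beta_a[1], "SE:", se_a[1])
print("coefficient on x1, option B:", beta_b[1], "SE:", se_b[1])
```

In this made-up setup, the excluded covariate is correlated with `x1` and affects `y`, so the two options can be compared both on the standard error and on how close the `x1` coefficient is to the value of 2.0 used in the simulation. With a different missingness pattern or an irrelevant covariate the comparison could look quite different, which is part of why I am asking for a reference.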