I have some data from the Labour Force Survey and I'm estimating a wage regression. I want to include degree classification as a regressor. However, the number of observations in the regression drops when I include this variable: not everyone has a degree, so degree classification is missing for non-graduates and those observations are dropped. How do I overcome this sample selection issue? I understand I shouldn't create dummy variables and simply replace the missing values with 0s, because this would skew the results, but I am unsure how else to deal with it. Any help?
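To make the problem concrete, here is a minimal sketch of what I am seeing. The data are made up purely for illustration, and the column names `log_wage`, `experience`, and `degree_class` are hypothetical, not the actual LFS variable names. It only counts the complete cases that a regression would use with and without the degree variable.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; not real Labour Force Survey records.
df = pd.DataFrame({
    "log_wage": [2.1, 2.5, 2.3, 2.8, 2.0, 2.6],
    "experience": [5, 10, 7, 12, 3, 9],
    # Degree classification is only recorded for graduates; NaN means no degree.
    "degree_class": ["first", np.nan, "upper_second", np.nan, np.nan, "lower_second"],
})

# Observations usable when the degree variable is left out.
n_without = len(df.dropna(subset=["log_wage", "experience"]))

# Observations usable once degree_class is included (listwise deletion of NaN rows).
n_with = len(df.dropna(subset=["log_wage", "experience", "degree_class"]))

print(n_without, n_with)
```

In this toy example the usable sample falls from 6 to 3 once `degree_class` is added, and the remaining rows are all graduates, which is the pattern I get in my real regression.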