Hello!
I'm running a set of regressions using Linked Employer-Employee Data (LEED). I'm estimating OLS, worker fixed effects, firm fixed effects, and finally worker and firm fixed effects together. I want to ensure that the observations used are the same across all models. Normally, I would solve the problem as in the following example:

Code:
webuse nlswork, clear
xtset idcode

reg ln_w grade age ttl_exp tenure not_smsa south
    gen s1 = (e(sample))

xtreg ln_w grade age ttl_exp tenure not_smsa south, fe
    gen s2 = (e(sample))
    
reghdfe ln_w grade age ttl_exp tenure not_smsa south, absorb(idcode)    
    gen s3 = (e(sample))
    
reg ln_w grade age ttl_exp tenure not_smsa south if s1 == 1 & s2 == 1 & s3 ==1
    est store m1

xtreg ln_w grade age ttl_exp tenure not_smsa south if s1 == 1 & s2 == 1 & s3 ==1, fe
    est store m2
    
reghdfe ln_w grade age ttl_exp tenure not_smsa south if s1 == 1 & s2 == 1 & s3 ==1, absorb(idcode)    
    est store m3
    
    esttab m1 m2 m3
However, this requires estimating every model once just to mark its sample, which I'm trying to avoid because reghdfe takes a long time to run with millions of observations.

One solution I've attempted is to ensure that none of my variables has missing values and that each id appears at least twice, so that the fixed effects can be estimated. Example:

Code:
webuse nlswork, clear
xtset idcode

egen miss = rowmiss(ln_w grade age ttl_exp tenure not_smsa south)
egen n_all = count(idcode), by(idcode)

reg ln_w grade age ttl_exp tenure not_smsa south if n_all > 1 & miss == 0
    est store m1
    
xtreg ln_w grade age ttl_exp tenure not_smsa south if n_all > 1 & miss == 0, fe
    est store m2
    
reghdfe ln_w grade age ttl_exp tenure not_smsa south if n_all > 1 & miss == 0, absorb(idcode)    
    est store m3
    
    esttab m1 m2 m3
However, when I run this second example, the third column (reghdfe) has 17 fewer observations than the other two because singleton observations are dropped, and I don't understand where these singletons come from.
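
To help narrow this down, here is a minimal diagnostic sketch. It rests on an assumption I have not confirmed: that the gap comes from ids with two or more rows in total but only one row with complete data, since n_all counts every row of an id, including rows that the miss == 0 condition later excludes. The variable names complete and n_ok are made up for illustration.

```stata
webuse nlswork, clear
xtset idcode

* Flag rows with no missing values in any model variable
egen miss = rowmiss(ln_w grade age ttl_exp tenure not_smsa south)
gen byte complete = (miss == 0)

* Original count: every row of an id, complete or not
egen n_all = count(idcode), by(idcode)

* Alternative count (hypothetical name): complete rows per id only
egen n_ok = total(complete), by(idcode)

* Rows that pass the original filter but are the only usable row of their id
count if n_all > 1 & miss == 0 & n_ok == 1

* Candidate common sample, defined before any model is estimated
reg ln_w grade age ttl_exp tenure not_smsa south if n_ok > 1 & complete
    est store m1

reghdfe ln_w grade age ttl_exp tenure not_smsa south if n_ok > 1 & complete, absorb(idcode)
    est store m3

    esttab m1 m3
```

If the count above accounts for the missing observations, filtering on n_ok instead of n_all should align the samples for a single absorbed id. With both worker and firm effects absorbed, one pass like this may not be enough, because removing a singleton in one dimension can leave a group in the other dimension with a single row, so the filter would probably need to be repeated until nothing more is dropped.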

Thank you in advance for your help!
Hélder