Good evening,

I have been reading the literature on estimating two-way fixed effects models, and it is a bit of a mess. I could not find a good survey or synthesis of what works and what does not; everybody seems to be doing their own thing.

I want to check what I am getting wrong in the following reasoning. Any contributions are welcome, and if Professor Jeff Wooldridge and Sergio Correia could look into the matter, I would greatly appreciate their input.

My reasoning is that in the standard/canonical two-way fixed effects model with cross-sectional identifier id and time index t, the id fixed effects are orthogonal to the t fixed effects.

This is because the cross-sectional id dummies vary only across id units (not across time), and the time dummies vary only across time (not across ids).

Therefore I reason that, since the id dummies are orthogonal to the time dummies, I should be able to obtain the two-way fixed effects estimates by a simple two-stage procedure:

1. Residualise every variable in the regression, using say -areg-, with respect to the first fixed effect.

2. Run a second -areg- regression on the residualised variables from step 1, absorbing the second fixed effect.

What I think should work almost works, but not exactly. So my question is: why does it not work exactly? Is there an error in my reasoning, or is it a numerical issue?
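
One premise that may be worth checking alongside this question is whether the panel is balanced, since the orthogonality argument above is easiest to defend when every id is observed in every period. A minimal sketch, assuming the nlswork data from the illustration below; n_years is a hypothetical helper variable, not something in the dataset:

Code:
* Load the data and describe the panel structure
webuse nlswork, clear
xtset idcode year
xtdescribe

* Count how many periods each idcode contributes
* (n_years is a made-up name used only for this check)
bysort idcode: generate n_years = _N
tabulate n_years

If the tabulation shows ids with fewer periods than the number of distinct years, the panel is unbalanced. The outputs below already point that way: 28,443 observations for 4,709 idcode groups and 15 years is well short of 4,709 x 15 = 70,635. In an unbalanced panel, the time dummies may no longer be orthogonal to the id dummies once the id means have been removed, so a single pass through each fixed effect need not reproduce the joint estimate.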

Here is an illustration. Let's say I want to fit a two-way fixed effects regression using the nlswork data, where the dependent variable is ln_wage, the regressors are age and hours, the id is idcode, and the time variable is year:

1) Step one: I residualise all the variables with respect to one of the fixed effects, idcode in this example.

Code:
. webuse nlswork, clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. qui foreach var of varlist ln_wage age hours {
. areg `var', absorb(idcode)
. predict double `var'res, resid
. }
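
A possible variant of step one, offered as a sketch rather than a correction: each -areg- call above runs on the observations where that one variable is non-missing, so the idcode means may be computed over slightly different samples than the one used in step two. The version below first restricts to complete cases and then subtracts the idcode mean directly, which is what the residual from -areg- with no regressors amounts to. The names ending in mean and res2 are hypothetical, chosen here to avoid clashing with the variables created above:

Code:
* Keep only observations with all three variables present
webuse nlswork, clear
keep if !missing(ln_wage, age, hours)

* Residualise on idcode by subtracting the within-idcode mean
foreach var of varlist ln_wage age hours {
    bysort idcode: egen double `var'mean = mean(`var')
    generate double `var'res2 = `var' - `var'mean
}

This only aligns the samples across variables; it does not by itself make the two-step estimates equal to the joint two-way estimates.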

2) Step two: I use the residualised variables in a regression that now absorbs the second fixed effect.

Code:
. areg  ln_wageres ageres hoursres, absorb(year)

Linear regression, absorbing indicators         Number of obs     =     28,443
Absorbed variable: year                         No. of categories =         15
                                                F(   2,  28426)   =     579.90
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1063
                                                Adj R-squared     =     0.1058
                                                Root MSE          =     0.2769

------------------------------------------------------------------------------
  ln_wageres |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ageres |   .0182627    .000538    33.94   0.000     .0172082    .0193172
    hoursres |   .0007479   .0002189     3.42   0.001     .0003188    .0011769
       _cons |   .0000474   .0016419     0.03   0.977    -.0031708    .0032656
------------------------------------------------------------------------------
F test of absorbed indicators: F(14, 28426) = 6.860           Prob > F = 0.000

The results are almost, but not exactly, what they should be:

Code:
. reghdfe ln_wage age hours, absorb(idcode year)
(dropped 554 singleton observations)
(MWFE estimator converged in 8 iterations)

HDFE Linear regression                            Number of obs   =     27,889
Absorbing 2 HDFE groups                           F(   2,  23718) =       5.75
                                                  Prob > F        =     0.0032
                                                  R-squared       =     0.6554
                                                  Adj R-squared   =     0.5948
                                                  Within R-sq.    =     0.0005
                                                  Root MSE        =     0.3030

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0128924   .0102256     1.26   0.207    -.0071505    .0329353
       hours |    .000756   .0002397     3.15   0.002     .0002862    .0012258
       _cons |   1.275635   .2973869     4.29   0.000      .692738    1.858533
------------------------------------------------------------------------------


. areg ln_wage age hours i.year, absorb(idcode)

Linear regression, absorbing indicators         Number of obs     =     28,443
Absorbed variable: idcode                       No. of categories =      4,709
                                                F(  16,  23718)   =     177.07
                                                Prob > F          =     0.0000
                                                R-squared         =     0.6648
                                                Adj R-squared     =     0.5980
                                                Root MSE          =     0.3030

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0128924   .0102256     1.26   0.207    -.0071505    .0329353
       hours |    .000756   .0002397     3.15   0.002     .0002862    .0012258
             |
        year |
         69  |   .0741072   .0159121     4.66   0.000     .0429185    .1052959
         70  |   .0475207   .0235895     2.01   0.044     .0012839    .0937576
         71  |   .0860412   .0328227     2.62   0.009     .0217066    .1503757
         72  |   .0856385   .0425291     2.01   0.044     .0022789    .1689982
         73  |   .0875726   .0523902     1.67   0.095    -.0151154    .1902607
         75  |   .0765988    .072101     1.06   0.288    -.0647237    .2179213
         77  |   .1071687   .0923094     1.16   0.246    -.0737636     .288101
         78  |   .1293613   .1029087     1.26   0.209    -.0723463    .3310689
         80  |   .1119272   .1229027     0.91   0.362    -.1289699    .3528242
         82  |   .1075358   .1432406     0.75   0.453     -.173225    .3882966
         83  |   .1190697   .1533399     0.78   0.437    -.1814863    .4196257
         85  |   .1429657   .1737723     0.82   0.411     -.197639    .4835705
         87  |   .1339107   .1942917     0.69   0.491    -.2469135    .5147349
         88  |   .1745405   .2081783     0.84   0.402    -.2335022    .5825832
             |
       _cons |   1.169893   .1956564     5.98   0.000     .7863941    1.553392
------------------------------------------------------------------------------
F test of absorbed indicators: F(4708, 23718) = 8.643         Prob > F = 0.000

And the results are not the same as in my two-step procedure.
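
The -reghdfe- log above reports that its estimator "converged in 8 iterations", which suggests that one pass through each fixed effect may not be enough. The sketch below repeats the two demeaning steps several times instead of doing each once. It is an illustration under assumptions, not a description of what -reghdfe- does internally: the pass count of 25 is arbitrary, and the tw suffix, m_id and m_yr are hypothetical names.

Code:
* Work on the complete-case sample
webuse nlswork, clear
keep if !missing(ln_wage, age, hours)

foreach var of varlist ln_wage age hours {
    generate double `var'tw = `var'
    * Alternate between removing idcode means and year means
    forvalues pass = 1/25 {
        bysort idcode: egen double m_id = mean(`var'tw)
        quietly replace `var'tw = `var'tw - m_id
        bysort year: egen double m_yr = mean(`var'tw)
        quietly replace `var'tw = `var'tw - m_yr
        drop m_id m_yr
    }
}

* Plain OLS on the repeatedly demeaned variables
regress ln_wagetw agetw hourstw

If the discrepancy comes from the single pass, the point estimates here should move towards the -reghdfe- ones as the number of passes grows; increase the pass count until they stop changing. The standard errors from -regress- are not expected to match in any case, because it does not count the absorbed idcode and year effects in the degrees of freedom.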

So does anybody see where I go wrong?