I have been reading the literature on estimating two-way fixed effects models, and it is a bit of a mess. I could not find a good survey or synthesis of what works and what does not; everybody seems to be doing their own thing.
I want to check what I am getting wrong in the following reasoning. Any contributions are welcome, and if Professor Jeff Wooldridge and Sergio Correia could look into the matter, I would greatly appreciate their input.
My reasoning is that in the standard/canonical two-way fixed effects model with cross sectional id and time series t, the id fixed effects are orthogonal to the t fixed effects.
This is because the cross sectional id dummies vary only across id units (but not across time), and the time dummies vary only across time (but not across ids).
Therefore I reason that as the id dummies are orthogonal to the time dummies, I should be able to obtain the two-way fixed effects by a simple two stage procedure:
1. Residualise every variable in the regression, using say -areg-, with respect to the first fixed effect.
2. Run a second -areg- regression on the residualised variables from step 1, absorbing the second fixed effect.
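The two steps can be written compactly in matrix notation. The symbols below are introduced here for illustration only: D_{id} and D_{t} are the matrices of id and time dummies, M_{id} and M_{t} are the corresponding demeaning operators, and M_{id,t} is the annihilator of the combined column space of both sets of dummies. The sketch assumes all variables are observed on the same rows.

```latex
\begin{align*}
y_{it} &= x_{it}'\beta + \alpha_i + \gamma_t + \varepsilon_{it} \\
M_{id} &= I - D_{id}(D_{id}'D_{id})^{-1}D_{id}', \qquad
M_{t} = I - D_{t}(D_{t}'D_{t})^{-1}D_{t}' \\
\hat\beta_{\text{two-step}} &= (X'M_{id}M_{t}M_{id}X)^{-1}X'M_{id}M_{t}M_{id}\,y \\
\hat\beta_{\text{TWFE}} &= (X'M_{id,t}X)^{-1}X'M_{id,t}\,y
\end{align*}
```

The two expressions agree if M_{id} M_{t} M_{id} = M_{id,t}. A sufficient condition is that M_{id} and M_{t} commute, so that is the property the two-step argument implicitly relies on.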
What I think should work almost works, but not exactly. So my question is: why is this not working exactly? Is there an error in my reasoning, or is it a numerical issue?
Here is an illustration. Let's say I want to fit a two-way fixed effects regression using the nlswork data, where the dependent variable is ln_wage, the regressors are age and hours, the id is idcode and the time variable is year:
1) Step one: I residualise all the variables with respect to one of the fixed effects, idcode in this example.
Code:
. webuse nlswork, clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. qui foreach var of varlist ln_wage age hours {
  . areg `var', absorb(idcode)
  . predict double `var'res, resid
  . }
Code:
. areg ln_wageres ageres hoursres, absorb(year)

Linear regression, absorbing indicators         Number of obs     =     28,443
Absorbed variable: year                         No. of categories =         15
                                                F(   2,  28426)   =     579.90
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1063
                                                Adj R-squared     =     0.1058
                                                Root MSE          =     0.2769

------------------------------------------------------------------------------
  ln_wageres |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ageres |   .0182627    .000538    33.94   0.000     .0172082    .0193172
    hoursres |   .0007479   .0002189     3.42   0.001     .0003188    .0011769
       _cons |   .0000474   .0016419     0.03   0.977    -.0031708    .0032656
------------------------------------------------------------------------------
F test of absorbed indicators: F(14, 28426) = 6.860           Prob > F = 0.000
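One way to probe the one-pass argument directly is to look at what the second demeaning does to a variable that has already been demeaned by idcode. The sketch below repeats both steps for age alone and then summarises the idcode means of the result. The names ageres2 and idmean are introduced here for illustration and are not part of the original code. If removing the year means left the idcode means untouched, idmean would be zero for every observation.

```stata
webuse nlswork, clear

* Step one for a single variable, as in the post: remove idcode means
quietly areg age, absorb(idcode)
predict double ageres, resid

* Step two for the same variable: remove year means from the step-one residual
quietly areg ageres, absorb(year)
predict double ageres2, resid

* Hypothetical diagnostic: idcode means of the twice-demeaned variable.
* Values away from zero mean the year step put idcode-level variation back in.
bysort idcode: egen double idmean = mean(ageres2)
summarize idmean, detail
```

The same check can be run for ln_wage and hours by swapping the variable name.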
Code:
. reghdfe ln_wage age hours, absorb(idcode year)
(dropped 554 singleton observations)
(MWFE estimator converged in 8 iterations)

HDFE Linear regression                            Number of obs   =     27,889
Absorbing 2 HDFE groups                           F(   2,  23718) =       5.75
                                                  Prob > F        =     0.0032
                                                  R-squared       =     0.6554
                                                  Adj R-squared   =     0.5948
                                                  Within R-sq.    =     0.0005
                                                  Root MSE        =     0.3030

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0128924   .0102256     1.26   0.207    -.0071505    .0329353
       hours |    .000756   .0002397     3.15   0.002     .0002862    .0012258
       _cons |   1.275635   .2973869     4.29   0.000      .692738    1.858533
------------------------------------------------------------------------------

. areg ln_wage age hours i.year, absorb(idcode)

Linear regression, absorbing indicators         Number of obs     =     28,443
Absorbed variable: idcode                       No. of categories =      4,709
                                                F(  16,  23718)   =     177.07
                                                Prob > F          =     0.0000
                                                R-squared         =     0.6648
                                                Adj R-squared     =     0.5980
                                                Root MSE          =     0.3030

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0128924   .0102256     1.26   0.207    -.0071505    .0329353
       hours |    .000756   .0002397     3.15   0.002     .0002862    .0012258
             |
        year |
         69  |   .0741072   .0159121     4.66   0.000     .0429185    .1052959
         70  |   .0475207   .0235895     2.01   0.044     .0012839    .0937576
         71  |   .0860412   .0328227     2.62   0.009     .0217066    .1503757
         72  |   .0856385   .0425291     2.01   0.044     .0022789    .1689982
         73  |   .0875726   .0523902     1.67   0.095    -.0151154    .1902607
         75  |   .0765988    .072101     1.06   0.288    -.0647237    .2179213
         77  |   .1071687   .0923094     1.16   0.246    -.0737636     .288101
         78  |   .1293613   .1029087     1.26   0.209    -.0723463    .3310689
         80  |   .1119272   .1229027     0.91   0.362    -.1289699    .3528242
         82  |   .1075358   .1432406     0.75   0.453     -.173225    .3882966
         83  |   .1190697   .1533399     0.78   0.437    -.1814863    .4196257
         85  |   .1429657   .1737723     0.82   0.411     -.197639    .4835705
         87  |   .1339107   .1942917     0.69   0.491    -.2469135    .5147349
         88  |   .1745405   .2081783     0.84   0.402    -.2335022    .5825832
             |
       _cons |   1.169893   .1956564     5.98   0.000     .7863941    1.553392
------------------------------------------------------------------------------
F test of absorbed indicators: F(4708, 23718) = 8.643         Prob > F = 0.000
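The reghdfe output above reports that its estimator "converged in 8 iterations", so it does not stop after a single pass over each fixed effect. As a rough illustration only, the two demeaning steps can be repeated several times before the final regression. This is plain alternating demeaning, not reghdfe's actual algorithm. The choice of 50 passes is arbitrary and may not be enough. The `_dm` variable names and the restriction to complete cases are assumptions added for the sketch.

```stata
webuse nlswork, clear

* Assumption: keep only complete cases so every mean uses the same rows
keep if !missing(ln_wage, age, hours)

foreach var of varlist ln_wage age hours {
    generate double `var'_dm = `var'
}

* Alternate between removing idcode means and year means.
* 50 passes is an arbitrary illustrative choice, not a convergence rule.
forvalues pass = 1/50 {
    foreach fe of varlist idcode year {
        foreach var of varlist ln_wage age hours {
            quietly bysort `fe': egen double mean_tmp = mean(`var'_dm)
            quietly replace `var'_dm = `var'_dm - mean_tmp
            drop mean_tmp
        }
    }
}

* Point estimates only: regress does not adjust the degrees of freedom
* for the fixed effects that were removed by hand.
regress ln_wage_dm age_dm hours_dm
```

The point estimates from the final regress can be compared with the reghdfe and areg results above. Its standard errors are not comparable, because regress does not account for the absorbed fixed effects.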
So does anybody see where I go wrong?