Today I ran into a strange situation: the number of observations shrinks when I expand the sample.
In particular, I count the non-missing observations of x1 and x2 for UNITEDS in my sample with
count if x1 != . & inlist(GEOGN, "UNITEDS")
count if x2 != . & inlist(GEOGN, "UNITEDS")
The results for these two variables are the same.
Then I run the regression of x1 on x2 for this country alone (UNITEDS)
Code:
. reghdfe x1 x2 if inlist(GEOGN, "UNITEDS"), a(TYPE2 INDC32#yr)
(dropped 1013 singleton observations)
note: x2 is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
(MWFE estimator converged in 14 iterations)
note: x2 omitted because of collinearity
HDFE Linear regression Number of obs = 54,409
Absorbing 2 HDFE groups F( 0, 47843) = .
Prob > F = .
R-squared = 0.8063
Adj R-squared = 0.7797
Within R-sq. = 0.0000
Root MSE = 0.3916
------------------------------------------------------------------------------
x1 | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
x2 | 0 (omitted)
_cons | 1.307023 .0016788 778.54 0.000 1.303733 1.310314
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
TYPE2 | 6131 0 6131 |
INDC32#yr | 450 15 435 |
-----------------------------------------------------+
Code:
. reghdfe x1 x2 if inlist(GEOGN, "CHINA" "UNITEDS" "INDONESIA" "RUSSIAN" "MEXICO" "JAPAN" "PHILIPPINES" "VIETNAM" "SOUTHKOREA") | inlist(GEOGN,"COLOMBIA" "CANADA" "P
> ERU" "MALAYSIA" "AUSTRALIA" "CHILE" "ECUADOR" "SINGAPORE" "NEWZEALAND"), a(TYPE2 INDC32#yr)
(dropped 194 singleton observations)
(MWFE estimator converged in 14 iterations)
HDFE Linear regression Number of obs = 22,689
Absorbing 2 HDFE groups F( 1, 18715) = 0.07
Prob > F = 0.7857
R-squared = 0.7423
Adj R-squared = 0.6876
Within R-sq. = 0.0000
Root MSE = 0.2734
------------------------------------------------------------------------------
x1 | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
x2 | .0160276 .058948 0.27 0.786 -.0995158 .1315709
_cons | .7591069 .0458817 16.54 0.000 .6691746 .8490393
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
TYPE2 | 3614 0 3614 |
INDC32#yr | 374 15 359 |
-----------------------------------------------------+
As suggested by Ken Chui, I then applied another way to select a subsample of countries (https://www.statalist.org/forums/for...st2-in-my-code)
It turns out that the number of observations for the expanded sample is much larger with this approach
Code:
gen include = 0
foreach ctry in CHINA UNITEDS INDONESIA RUSSIAN MEXICO JAPAN PHILIPPINES ///
    VIETNAM SOUTHKOREA COLOMBIA CANADA PERU MALAYSIA AUSTRALIA ///
    CHILE ECUADOR SINGAPORE NEWZEALAND {
    replace include = 1 if GEOGN == "`ctry'"
}
reghdfe x1 x2 if include == 1, a(TYPE2 INDC32#yr)
Code:
. reghdfe x1 x2 if include == 1, a(TYPE2 INDC32#yr)
(dropped 2165 singleton observations)
(MWFE estimator converged in 13 iterations)
HDFE Linear regression Number of obs = 232,994
Absorbing 2 HDFE groups F( 1, 209389) = 88.97
Prob > F = 0.0000
R-squared = 0.8183
Adj R-squared = 0.7978
Within R-sq. = 0.0004
Root MSE = 0.3176
------------------------------------------------------------------------------
x1 | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
x2 | .0282004 .0029897 9.43 0.000 .0223407 .0340601
_cons | 1.079796 .0023016 469.15 0.000 1.075285 1.084307
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
TYPE2 | 23169 0 23169 |
INDC32#yr | 450 15 435 |
-----------------------------------------------------+
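One way to see where the two selections diverge is to build the inlist() condition as a flag and compare it with include. The sketch below writes the country names as comma-separated arguments, which is how the inlist() syntax is documented. Whether the missing commas in the earlier command explain the smaller sample is an assumption that has not been verified here. flag_inlist is a name made up for this illustration, and the code assumes that GEOGN is a string variable and that include has already been created by the loop above.

```stata
* Sketch only: flag_inlist is a hypothetical variable name.
* inlist() takes at most 10 arguments when they are strings, so each call
* holds GEOGN plus 9 comma-separated country names.
gen byte flag_inlist = inlist(GEOGN, "CHINA", "UNITEDS", "INDONESIA", ///
    "RUSSIAN", "MEXICO", "JAPAN", "PHILIPPINES", "VIETNAM", "SOUTHKOREA") ///
    | inlist(GEOGN, "COLOMBIA", "CANADA", "PERU", "MALAYSIA", "AUSTRALIA", ///
    "CHILE", "ECUADOR", "SINGAPORE", "NEWZEALAND")

* Cross-tabulate the two flags; any off-diagonal cell means the
* two ways of selecting countries do not pick the same observations.
tabulate flag_inlist include

* Only observations with both variables non-missing can enter the regression.
count if include == 1 & !missing(x1, x2)
```

If the cross-tabulation has empty off-diagonal cells, both selections pick the same rows, and the difference in the number of observations must come from somewhere else.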