I want to run an OLS regression on my dataset and need advice on whether I should use areg, reghdfe, or xtreg.
Data: I'm using a subsample of this data. Each observation is one birth, with information on the baby, the mother, and the father, plus birth time, state, county, etc. There is no ID for the mother, so I can't tell whether the same mothers appear in different years. I'm assuming they are different mothers.
I'm asked to run the following regression:
Regression: Y_icmy = b0 + b1*EmploymentRate_cy + b2*X_i + u_ym + e_icmy
i: each birth
c: mom's residing county
m: birth month
y: birth year
Y_icmy is one of 4 outcomes: the baby's birth weight indicator, delivery method, number of prenatal care visits, and pre-term birth.
EmploymentRate_cy is the employment rate of county c in year y
X_i is a group of dummies (e.g., mom's education, race, ...)
u_ym is a set of year-by-month fixed effects of the conception time
Standard errors are clustered on the mom's county
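A minimal sketch of how this specification might be estimated for one outcome, using the variable names from the code further down (birthwt, emptopop, conceptmodate, cntyrfip). It assumes that conceptmodate takes one value per conception year-month, which the post does not confirm. The equation has no county fixed effect, so the only absorbed term is u_ym. reghdfe is a user-written command and has to be installed from SSC first.

```stata
* Sketch only: assumes the birth-level data are already in memory and that
* conceptmodate identifies the conception year-month cell (an assumption).
* Run from a do-file, since /// line continuation does not work interactively.
* ssc install reghdfe      // one-time install of the user-written command

* areg: absorb the year-by-month effects u_ym, cluster on the mother's county
areg birthwt emptopop i.dummage i.dmeduc i.dummrace i.dmar i.csex i.dumbirorder, ///
    absorb(conceptmodate) vce(cluster cntyrfip)

* reghdfe: same model, same absorbed effect
reghdfe birthwt emptopop i.dummage i.dmeduc i.dummrace i.dmar i.csex i.dumbirorder, ///
    absorb(conceptmodate) vce(cluster cntyrfip)
```

The two commands should give the same point estimates. By contrast, xtreg with the fe option after xtset cntyrfip absorbs county effects, which the equation as written does not include.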
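Stata Results: I tried xtreg, areg, and reghdfe; the code and results are below:
xtreg:
Code:
* declare county as the panel (group) variable; no time variable is set
xtset cntyrfip
* county fixed effects via fe; conceptmodate enters as a single linear term here
foreach yvar in birthwt pretermbir csec nprevis {
    xtreg `yvar' emptopop i.dummage i.dmeduc i.dummrace i.dmar i.csex i.dumbirorder conceptmodate, fe cluster(cntyrfip)
}
Code: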
panel variable: cntyrfip (unbalanced)
Fixed-effects (within) regression Number of obs = 415,258
Group variable: cntyrfip Number of groups = 28
R-sq: Obs per group:
within = 0.0132 min = 536
between = 0.3507 avg = 14,830.6
overall = 0.0133 max = 97,957
F(27,27) = .
corr(u_i, Xb) = -0.1707 Prob > F = .
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
birthwt | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | .0188474 .0422724 0.45 0.659 -.0678885 .1055833
|
dummage |
2 | -.0234066 .0014085 -16.62 0.000 -.0262965 -.0205166
3 | -.0197053 .0022694 -8.68 0.000 -.0243617 -.0150488
4 | -.0005011 .0025157 -0.20 0.844 -.0056628 .0046606
|
dmeduc |
1 | -.0060045 .012484 -0.48 0.634 -.0316195 .0196105
2 | -.009585 .0083465 -1.15 0.261 -.0267106 .0075406
3 | -.0201655 .009807 -2.06 0.050 -.0402877 -.0000433
4 | -.0212308 .0099151 -2.14 0.041 -.0415749 -.0008867
5 | -.0136463 .0039725 -3.44 0.002 -.0217972 -.0054953
6 | -.0267736 .004405 -6.08 0.000 -.0358119 -.0177353
7 | -.0163685 .0086613 -1.89 0.070 -.03414 .001403
8 | -.016259 .0078926 -2.06 0.049 -.0324533 -.0000647
9 | -.0179093 .0059974 -2.99 0.006 -.0302151 -.0056036
10 | -.0159399 .006219 -2.56 0.016 -.0287002 -.0031796
11 | -.0152933 .0070564 -2.17 0.039 -.0297719 -.0008147
12 | -.0236959 .0065388 -3.62 0.001 -.0371124 -.0102793
13 | -.0281477 .0068223 -4.13 0.000 -.0421458 -.0141495
14 | -.0305678 .0065424 -4.67 0.000 -.0439918 -.0171439
15 | -.0319048 .0066587 -4.79 0.000 -.0455673 -.0182422
16 | -.036879 .0070394 -5.24 0.000 -.0513228 -.0224353
17 | -.0372261 .0077217 -4.82 0.000 -.0530698 -.0213824
|
dummrace |
2 | .0545832 .0013128 41.58 0.000 .0518895 .0572769
3 | -.0051098 .0015808 -3.23 0.003 -.0083533 -.0018662
4 | .011673 .0025248 4.62 0.000 .0064926 .0168535
|
2.dmar | .0237415 .0017259 13.76 0.000 .0202003 .0272827
2.csex | .0097996 .0006279 15.61 0.000 .0085112 .011088
4.dumbirorder | -.0609698 .0063748 -9.56 0.000 -.0740499 -.0478897
conceptmodate | .0001519 .0000231 6.58 0.000 .0001045 .0001992
_cons | .024313 .0177508 1.37 0.182 -.0121087 .0607347
--------------+----------------------------------------------------------------
sigma_u | .00738704
sigma_e | .23620757
rho | .00097708 (fraction of variance due to u_i)
-------------------------------------------------------------------------------
Fixed-effects (within) regression Number of obs = 415,258
Group variable: cntyrfip Number of groups = 28
R-sq: Obs per group:
within = 0.0146 min = 536
between = 0.0797 avg = 14,830.6
overall = 0.0096 max = 97,957
F(27,27) = .
corr(u_i, Xb) = -0.6292 Prob > F = .
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
pretermbir | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | .2171136 .0674441 3.22 0.003 .0787297 .3554975
|
dummage |
2 | -.043691 .0026736 -16.34 0.000 -.0491767 -.0382052
3 | -.0487423 .0033546 -14.53 0.000 -.0556255 -.0418592
4 | -.0188399 .0034741 -5.42 0.000 -.0259681 -.0117117
|
dmeduc |
1 | .013965 .0137682 1.01 0.319 -.0142851 .042215
2 | .009339 .0068673 1.36 0.185 -.0047516 .0234296
3 | -.0030966 .0064848 -0.48 0.637 -.0164022 .010209
4 | -.0130372 .0096425 -1.35 0.188 -.032822 .0067475
5 | -.0129679 .0050734 -2.56 0.017 -.0233778 -.0025581
6 | -.0282894 .0070967 -3.99 0.000 -.0428506 -.0137283
7 | -.0197083 .008737 -2.26 0.032 -.037635 -.0017815
8 | -.014648 .0061302 -2.39 0.024 -.0272261 -.00207
9 | -.0230106 .0072043 -3.19 0.004 -.0377926 -.0082286
10 | -.019993 .0058661 -3.41 0.002 -.0320293 -.0079568
11 | -.0193685 .0073412 -2.64 0.014 -.0344315 -.0043055
12 | -.0248285 .0069535 -3.57 0.001 -.0390959 -.0105612
13 | -.0286812 .0071548 -4.01 0.000 -.0433618 -.0140007
14 | -.0267416 .0076261 -3.51 0.002 -.0423891 -.011094
15 | -.0294642 .0061119 -4.82 0.000 -.0420048 -.0169236
16 | -.0382608 .007095 -5.39 0.000 -.0528185 -.0237031
17 | -.0387698 .0078232 -4.96 0.000 -.0548217 -.0227179
|
dummrace |
2 | .0682115 .0034626 19.70 0.000 .0611069 .075316
3 | .0084065 .0036133 2.33 0.028 .0009926 .0158204
4 | .0143703 .0030856 4.66 0.000 .0080393 .0207014
|
2.dmar | .0324672 .00117 27.75 0.000 .0300665 .0348679
2.csex | -.0087205 .0008703 -10.02 0.000 -.0105062 -.0069348
4.dumbirorder | -.1525281 .0091189 -16.73 0.000 -.1712385 -.1338176
conceptmodate | .0002491 .0000394 6.32 0.000 .0001682 .00033
_cons | -.01957 .0248677 -0.79 0.438 -.0705943 .0314542
--------------+----------------------------------------------------------------
sigma_u | .02551021
sigma_e | .30501554
rho | .00694635 (fraction of variance due to u_i)
-------------------------------------------------------------------------------
Fixed-effects (within) regression Number of obs = 415,258
Group variable: cntyrfip Number of groups = 28
R-sq: Obs per group:
within = . min = 536
between = . avg = 14,830.6
overall = . max = 97,957
F(0,27) = .
corr(u_i, Xb) = . Prob > F = .
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
csec | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | 0 (omitted)
|
dummage |
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
|
dmeduc |
1 | 0 (omitted)
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
5 | 0 (omitted)
6 | 0 (omitted)
7 | 0 (omitted)
8 | 0 (omitted)
9 | 0 (omitted)
10 | 0 (omitted)
11 | 0 (omitted)
12 | 0 (omitted)
13 | 0 (omitted)
14 | 0 (omitted)
15 | 0 (omitted)
16 | 0 (omitted)
17 | 0 (omitted)
|
dummrace |
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
|
2.dmar | 0 (omitted)
2.csex | 0 (omitted)
4.dumbirorder | 0 (omitted)
conceptmodate | 0 (omitted)
_cons | 0 (omitted)
--------------+----------------------------------------------------------------
sigma_u | 0
sigma_e | 0
rho | . (fraction of variance due to u_i)
-------------------------------------------------------------------------------
Fixed-effects (within) regression Number of obs = 415,258
Group variable: cntyrfip Number of groups = 28
R-sq: Obs per group:
within = 0.1525 min = 536
between = 0.5075 avg = 14,830.6
overall = 0.1613 max = 97,957
F(27,27) = .
corr(u_i, Xb) = 0.0055 Prob > F = .
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
nprevis | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | -1.909821 3.595874 -0.53 0.600 -9.287944 5.468302
|
dummage |
2 | .819195 .0677186 12.10 0.000 .6802479 .9581422
3 | 1.592506 .0703909 22.62 0.000 1.448076 1.736936
4 | 1.838432 .0741365 24.80 0.000 1.686316 1.990547
|
dmeduc |
1 | .9760593 .3967149 2.46 0.021 .1620674 1.790051
2 | .7211828 .1885137 3.83 0.001 .3343847 1.107981
3 | .8592792 .4071324 2.11 0.044 .0239126 1.694646
4 | .9825091 .3128487 3.14 0.004 .3405965 1.624422
5 | .9184139 .3094473 2.97 0.006 .2834804 1.553347
6 | 1.243555 .459368 2.71 0.012 .3010099 2.186101
7 | 1.986289 .5034672 3.95 0.001 .9532593 3.019318
8 | 2.233033 .5824491 3.83 0.001 1.037947 3.42812
9 | 2.199378 .5549521 3.96 0.000 1.06071 3.338045
10 | 2.760089 .5764386 4.79 0.000 1.577334 3.942843
11 | 2.874429 .6213346 4.63 0.000 1.599556 4.149303
12 | 3.712736 .6371179 5.83 0.000 2.405478 5.019994
13 | 4.105504 .6737145 6.09 0.000 2.723156 5.487852
14 | 4.233067 .6717071 6.30 0.000 2.854838 5.611296
15 | 4.240404 .6982558 6.07 0.000 2.807701 5.673106
16 | 4.405252 .696915 6.32 0.000 2.9753 5.835203
17 | 4.456904 .7533609 5.92 0.000 2.911135 6.002673
|
dummrace |
2 | -.7893785 .0684183 -11.54 0.000 -.9297612 -.6489959
3 | -1.095532 .1876042 -5.84 0.000 -1.480464 -.7106
4 | -1.006129 .0555952 -18.10 0.000 -1.120201 -.8920574
|
2.dmar | -1.592861 .0685017 -23.25 0.000 -1.733415 -1.452307
2.csex | .0508665 .0097061 5.24 0.000 .0309511 .0707818
4.dumbirorder | -1.412481 .4557176 -3.10 0.004 -2.347536 -.4774254
conceptmodate | .0170811 .0034858 4.90 0.000 .0099288 .0242334
_cons | 1.180458 1.121609 1.05 0.302 -1.120894 3.48181
--------------+----------------------------------------------------------------
sigma_u | .73037114
sigma_e | 4.1227098
rho | .03042992 (fraction of variance due to u_i)
-------------------------------------------------------------------------------
areg:
Code:
* absorb conceptmodate as a categorical fixed effect; cluster on county
foreach yvar in birthwt pretermbir csec nprevis {
    areg `yvar' emptopop i.dummage i.dmeduc i.dummrace i.dmar i.csex i.dumbirorder, absorb(conceptmodate) cluster(cntyrfip)
}
Code:
Linear regression, absorbing indicators Number of obs = 415,258
F( 27, 27) = 1328.66
Prob > F = 0.0000
R-squared = 0.2888
Adj R-squared = 0.2853
Root MSE = 0.2011
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
birthwt | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | .0008996 .0070898 0.13 0.900 -.0136474 .0154466
|
dummage |
2 | -.0047123 .0012823 -3.67 0.001 -.0073435 -.0020812
3 | .0000784 .0015011 0.05 0.959 -.0030015 .0031583
4 | .0076676 .0019832 3.87 0.001 .0035984 .0117368
|
dmeduc |
1 | -.0038257 .010759 -0.36 0.725 -.0259014 .01825
2 | -.0139574 .0073547 -1.90 0.068 -.029048 .0011331
3 | -.0135079 .006974 -1.94 0.063 -.0278174 .0008015
4 | -.0108141 .0083696 -1.29 0.207 -.027987 .0063588
5 | -.0042134 .0051607 -0.82 0.421 -.0148022 .0063754
6 | -.0105367 .0046365 -2.27 0.031 -.02005 -.0010234
7 | -.0050185 .0071526 -0.70 0.489 -.0196944 .0096575
8 | -.0067927 .0069436 -0.98 0.337 -.0210397 .0074544
9 | -.0041783 .0052189 -0.80 0.430 -.0148867 .0065301
10 | -.0031763 .0052651 -0.60 0.551 -.0139794 .0076267
11 | -.0041207 .0052407 -0.79 0.439 -.0148738 .0066323
12 | -.0102089 .0056143 -1.82 0.080 -.0217285 .0013108
13 | -.0136341 .0058264 -2.34 0.027 -.0255888 -.0016794
14 | -.0156871 .0052049 -3.01 0.006 -.0263667 -.0050076
15 | -.0164624 .0066196 -2.49 0.019 -.0300447 -.0028802
16 | -.0187018 .0059169 -3.16 0.004 -.0308422 -.0065615
17 | -.0186256 .0060682 -3.07 0.005 -.0310765 -.0061747
|
dummrace |
2 | .0223933 .0011831 18.93 0.000 .0199659 .0248207
3 | -.0082914 .0009647 -8.59 0.000 -.0102708 -.006312
4 | .0055149 .0019515 2.83 0.009 .0015107 .0095191
|
2.dmar | .0104538 .001398 7.48 0.000 .0075853 .0133223
2.csex | .0135626 .0006175 21.96 0.000 .0122956 .0148296
4.dumbirorder | -.0188707 .0078462 -2.41 0.023 -.0349698 -.0027716
_cons | .0634819 .0058863 10.78 0.000 .0514042 .0755596
--------------+----------------------------------------------------------------
conceptmodate | absorbed (1985 categories)
Linear regression, absorbing indicators Number of obs = 415,258
F( 0, 27) = .
Prob > F = .
R-squared = 1.0000
Adj R-squared = 1.0000
Root MSE = 0.0000
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
pretermbir | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | 0 (omitted)
|
dummage |
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
|
dmeduc |
1 | 0 (omitted)
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
5 | 0 (omitted)
6 | 0 (omitted)
7 | 0 (omitted)
8 | 0 (omitted)
9 | 0 (omitted)
10 | 0 (omitted)
11 | 0 (omitted)
12 | 0 (omitted)
13 | 0 (omitted)
14 | 0 (omitted)
15 | 0 (omitted)
16 | 0 (omitted)
17 | 0 (omitted)
|
dummrace |
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
|
2.dmar | 0 (omitted)
2.csex | 0 (omitted)
4.dumbirorder | 0 (omitted)
_cons | .1056115 . . . . .
--------------+----------------------------------------------------------------
conceptmodate | absorbed (1985 categories)
Linear regression, absorbing indicators Number of obs = 415,258
F( 0, 27) = .
Prob > F = .
Root MSE = 0.0000
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
csec | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | 0 (omitted)
|
dummage |
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
|
dmeduc |
1 | 0 (omitted)
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
5 | 0 (omitted)
6 | 0 (omitted)
7 | 0 (omitted)
8 | 0 (omitted)
9 | 0 (omitted)
10 | 0 (omitted)
11 | 0 (omitted)
12 | 0 (omitted)
13 | 0 (omitted)
14 | 0 (omitted)
15 | 0 (omitted)
16 | 0 (omitted)
17 | 0 (omitted)
|
dummrace |
2 | 0 (omitted)
3 | 0 (omitted)
4 | 0 (omitted)
|
2.dmar | 0 (omitted)
2.csex | 0 (omitted)
4.dumbirorder | 0 (omitted)
_cons | 0 (omitted)
--------------+----------------------------------------------------------------
conceptmodate | absorbed (1985 categories)
Linear regression, absorbing indicators Number of obs = 415,258
F( 27, 27) = 7839401.62
Prob > F = 0.0000
R-squared = 0.1825
Adj R-squared = 0.1785
Root MSE = 4.1466
(Std. Err. adjusted for 28 clusters in cntyrfip)
-------------------------------------------------------------------------------
| Robust
nprevis | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
emptopop | .3913594 .8524499 0.46 0.650 -1.357723 2.140442
|
dummage |
2 | .6713209 .0607689 11.05 0.000 .5466333 .7960085
3 | 1.440154 .0822093 17.52 0.000 1.271474 1.608833
4 | 1.77644 .0759003 23.40 0.000 1.620705 1.932174
|
dmeduc |
1 | .9436234 .3958303 2.38 0.024 .1314466 1.7558
2 | .7093487 .2250707 3.15 0.004 .2475418 1.171156
3 | .7419543 .4287262 1.73 0.095 -.1377193 1.621628
4 | .8791356 .3429686 2.56 0.016 .1754221 1.582849
5 | .7503109 .3432982 2.19 0.038 .0459211 1.454701
6 | .9775896 .5125862 1.91 0.067 -.0741503 2.02933
7 | 1.878452 .5276921 3.56 0.001 .7957174 2.961187
8 | 2.154712 .6022926 3.58 0.001 .91891 3.390515
9 | 2.084571 .5881248 3.54 0.001 .8778384 3.291303
10 | 2.711758 .595407 4.55 0.000 1.490084 3.933432
11 | 2.848627 .6411 4.44 0.000 1.533199 4.164056
12 | 3.694534 .6486478 5.70 0.000 2.363619 5.025449
13 | 4.072212 .6832652 5.96 0.000 2.670267 5.474156
14 | 4.190809 .6825376 6.14 0.000 2.790358 5.591261
15 | 4.169677 .7214367 5.78 0.000 2.689411 5.649943
16 | 4.374855 .6988821 6.26 0.000 2.940867 5.808842
17 | 4.464445 .7446155 6.00 0.000 2.93662 5.99227
|
dummrace |
2 | -.7008378 .0501316 -13.98 0.000 -.8036994 -.5979761
3 | -1.135264 .3336207 -3.40 0.002 -1.819797 -.4507306
4 | -.9966281 .0519124 -19.20 0.000 -1.103144 -.8901126
|
2.dmar | -1.558454 .0886083 -17.59 0.000 -1.740264 -1.376645
2.csex | .0359697 .0105536 3.41 0.002 .0143155 .0576239
4.dumbirorder | -1.361785 .4532831 -3.00 0.006 -2.291845 -.4317251
_cons | 6.885657 .6308294 10.92 0.000 5.591302 8.180012
--------------+----------------------------------------------------------------
conceptmodate | absorbed (1985 categories)
Now, my questions are:
1. Given my data and model, should I choose xtreg or areg? If I choose xtreg, how should I set up my panel variable and time variable in xtset?
2. Neither xtreg nor areg generates ideal results: how do I deal with the omitted variables and insignificant coefficients? I understand insignificant coefficients are common, but if the problem comes from mishandling the data or misusing the command in Stata, what are the possible reasons and how should I debug? I can't change the model.
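A few checks that might help narrow down the omitted-coefficient output, sketched under the assumption that the same dataset and variable names are in memory. A dependent variable that is constant (or nearly so) in the estimation sample is one possible cause of every coefficient being omitted, and 1985 absorbed categories looks large for year-by-month cells, so the storage format of conceptmodate is worth inspecting. The variable conceptym is a hypothetical name introduced only for illustration.

```stata
* Sketch only: assumes the same dataset and variable names as the code above.

* 1. Is any outcome constant or mostly missing? A dependent variable with no
*    variation is one possible reason for all coefficients being omitted.
summarize birthwt pretermbir csec nprevis
tabulate csec, missing

* 2. How is the conception date stored? Look at its type, display format,
*    range, and number of distinct values.
describe conceptmodate
codebook conceptmodate, compact

* 3. Run this step only if conceptmodate turns out to be a daily (%td) date:
*    convert it to a monthly date before absorbing it.
*    conceptym is a hypothetical new variable name.
generate conceptym = mofd(conceptmodate)
format conceptym %tm
```

If step 3 applies, conceptym rather than conceptmodate would be the variable to absorb as the year-by-month fixed effect.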
Thanks, any input is greatly appreciated!