Collinearity issues with -areg- and not with -reghdfe-

Hi Statalist.

I am having trouble diagnosing a collinearity problem. Observations in my dataset are counties across years, ranging from 1996 to 2006. I am running a regression with county fixed effects. The variables intent, defier, m0, m1, m2 and m3 are binary, and running is an integer that ranges from 0 to 100.

-areg- drops some of the variables due to collinearity:

Code:

* areg drops variables
# delimit ;
    areg ${outcome}

    1.intent#0.defier#1.m0
    l1.1.intent#l1.0.defier#1.m1
    l2.1.intent#l2.0.defier#1.m2
    l3.1.intent#l3.0.defier#1.m3
    
    c.running#0.defier#1.m0
    l1.c.running#l1.0.defier#1.m1
    l2.c.running#l2.0.defier#1.m2
    l3.c.running#l3.0.defier#1.m3

    0.defier#m0
    0.l1.defier#m1
    0.l2.defier#m2
    0.l3.defier#m3    
        
                    
    if inrange(year,1996,2006)
    & insample                
    , cluster(cty) absorb(cty);
# delimit cr

note: 0L2.defier#1.m2#cL2.running omitted because of collinearity
note: 0L3.defier#1.m3#cL3.running omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =      1,100
                                                F(  10,     99)   =      11.86
                                                Prob > F          =     0.0000
                                                R-squared         =     0.5210
                                                Adj R-squared     =     0.4683
                                                Root MSE          =     1.4812

                                         (Std. Err. adjusted for 100 clusters in cty)
-------------------------------------------------------------------------------------
                    |               Robust
         unemp_rate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
   intent#defier#m0 |
             1 0 1  |  -.9089798    .678034    -1.34   0.183    -2.254346    .4363869
                    |
  L.intent#L.defier#|
                 m1 |
             1 0 1  |  -1.139256   .6345052    -1.80   0.076    -2.398252    .1197395
                    |
                L2. |
             intent#|
       L2.defier#m2 |
             1 0 1  |    -1.0513    .492405    -2.14   0.035    -2.028339   -.0742618
                    |
                L3. |
             intent#|
       L3.defier#m3 |
             1 0 1  |  -1.969312   1.007312    -1.96   0.053    -3.968037    .0294126
                    |
defier#m0#c.running |
               0 1  |  -.0625051   .0289846    -2.16   0.033    -.1200169   -.0049933
                    |
        L.defier#m1#|
         cL.running |
               0 1  |  -.0364417   .0252344    -1.44   0.152    -.0865122    .0136287
                    |
       L2.defier#m2#|
        cL2.running |
               0 1  |          0  (omitted)
                    |
       L3.defier#m3#|
        cL3.running |
               0 1  |          0  (omitted)
                    |
          defier#m0 |
               0 1  |  -.0751216   .3238042    -0.23   0.817    -.7176194    .5673762
                    |
        L.defier#m1 |
               0 1  |   .0497029   .2771395     0.18   0.858    -.5002019    .5996077
                    |
       L2.defier#m2 |
               0 1  |   .4190932   .2787426     1.50   0.136    -.1339925    .9721789
                    |
       L3.defier#m3 |
               0 1  |   1.926908   .5189397     3.71   0.000     .8972196    2.956597
                    |
              _cons |   6.062341   .4269764    14.20   0.000     5.215127    6.909555
--------------------+----------------------------------------------------------------
                cty |   absorbed                                     (100 categories)

The omitted variables are collinear in the years through 1999 but not from 2000 onward, so they are not collinear in the full sample. If I estimate the regression on the 2000-2006 subsample, there are no collinearity warnings:

Code:

* areg doesn't drop variables if post 1999
# delimit ;
    areg ${outcome}

    1.intent#0.defier#1.m0
    l1.1.intent#l1.0.defier#1.m1
    l2.1.intent#l2.0.defier#1.m2
    l3.1.intent#l3.0.defier#1.m3
    
    c.running#0.defier#1.m0
    l1.c.running#l1.0.defier#1.m1
    l2.c.running#l2.0.defier#1.m2
    l3.c.running#l3.0.defier#1.m3

    0.defier#m0
    0.l1.defier#m1
    0.l2.defier#m2
    0.l3.defier#m3    
        
                    
    if inrange(year,2000,2006)
    & insample                
    , cluster(cty) absorb(cty);
# delimit cr

Linear regression, absorbing indicators         Number of obs     =        700
                                                F(  12,     99)   =       6.15
                                                Prob > F          =     0.0000
                                                R-squared         =     0.6168
                                                Adj R-squared     =     0.5444
                                                Root MSE          =     1.1326

                                         (Std. Err. adjusted for 100 clusters in cty)
-------------------------------------------------------------------------------------
                    |               Robust
         unemp_rate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
   intent#defier#m0 |
             1 0 1  |   .4648492   .3104504     1.50   0.137    -.1511518     1.08085
                    |
  L.intent#L.defier#|
                 m1 |
             1 0 1  |  -.0860283   .2276794    -0.38   0.706    -.5377936    .3657371
                    |
                L2. |
             intent#|
       L2.defier#m2 |
             1 0 1  |  -.1695172   .3319051    -0.51   0.611     -.828089    .4890546
                    |
                L3. |
             intent#|
       L3.defier#m3 |
             1 0 1  |    .226809   .4802937     0.47   0.638    -.7261979    1.179816
                    |
defier#m0#c.running |
               0 1  |  -.0109445   .0076418    -1.43   0.155    -.0261074    .0042184
                    |
        L.defier#m1#|
         cL.running |
               0 1  |   .0101195   .0064702     1.56   0.121    -.0027188    .0229577
                    |
       L2.defier#m2#|
        cL2.running |
               0 1  |   .0129306   .0083487     1.55   0.125     -.003635    .0294962
                    |
       L3.defier#m3#|
        cL3.running |
               0 1  |  -.0003409   .0113759    -0.03   0.976    -.0229131    .0222313
                    |
          defier#m0 |
               0 1  |  -.5470303   .3260741    -1.68   0.097    -1.194032    .0999715
                    |
        L.defier#m1 |
               0 1  |  -.4926713   .2377283    -2.07   0.041    -.9643758   -.0209669
                    |
       L2.defier#m2 |
               0 1  |  -.0659969   .2842434    -0.23   0.817    -.6299975    .4980037
                    |
       L3.defier#m3 |
               0 1  |    .023994   .2798196     0.09   0.932    -.5312288    .5792167
                    |
              _cons |   6.464826   .2387412    27.08   0.000     5.991112     6.93854
--------------------+----------------------------------------------------------------
                cty |   absorbed                                     (100 categories)
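
The claim that the two omitted terms lack independent variation only in the early years can be checked directly. The following is a minimal sketch, not part of the original runs: it assumes the panel has been declared with -xtset cty year- (the lag operators above require some such declaration), and x2, x3 and early are hypothetical names introduced only for this check.

```stata
* Assumption: -xtset cty year- (or -tsset-) has already been run.
* x2 and x3 are hypothetical names for the two terms that -areg- omitted:
*   0L2.defier#1.m2#cL2.running  and  0L3.defier#1.m3#cL3.running
generate double x2 = (L2.defier == 0) * (m2 == 1) * L2.running ///
    if !missing(L2.defier, m2, L2.running)
generate double x3 = (L3.defier == 0) * (m3 == 1) * L3.running ///
    if !missing(L3.defier, m3, L3.running)

* Flag the early years (1996-1999) versus the later years (2000-2006).
generate byte early = year <= 1999 if inrange(year, 1996, 2006)

* Overall summaries by period: a term that is constant in one period
* carries no identifying variation there.
tabstat x2 x3 if insample, by(early) statistics(n mean sd min max)

* Between/within decomposition by period: a within standard deviation of
* zero means the term is absorbed by the county fixed effects.
xtsum x2 x3 if insample & early == 1
xtsum x2 x3 if insample & early == 0
```

If x2 and x3 vary within county in the later years, they are not collinear with the county effects in the full sample, which is what the 2000-2006 regression above suggests.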

If I estimate the regression on the full sample with -reghdfe- instead of -areg-, I don't get any collinearity warnings and I get very different results.

Code:

* reghdfe doesn't drop variables

# delimit ;
    reghdfe ${outcome}

    1.intent#0.defier#1.m0
    l1.1.intent#l1.0.defier#1.m1
    l2.1.intent#l2.0.defier#1.m2
    l3.1.intent#l3.0.defier#1.m3
    
    c.running#0.defier#1.m0
    l1.c.running#l1.0.defier#1.m1
    l2.c.running#l2.0.defier#1.m2
    l3.c.running#l3.0.defier#1.m3

    0.defier#m0
    0.l1.defier#m1
    0.l2.defier#m2
    0.l3.defier#m3    
        
                    
    if inrange(year,1996,2006)
    & insample                
    , cluster(cty) absorb(cty);
# delimit cr

(converged in 1 iterations)

HDFE Linear regression                            Number of obs   =      1,100
Absorbing 1 HDFE group                            F(  12,     99) =      22.41
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.6201
                                                  Adj R-squared   =     0.5774
                                                  Within R-sq.    =     0.1373
Number of clusters (cty)     =        100         Root MSE        =     1.3204

                                         (Std. Err. adjusted for 100 clusters in cty)
-------------------------------------------------------------------------------------
                    |               Robust
         unemp_rate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------------+----------------------------------------------------------------
   intent#defier#m0 |
             1 0 1  |   .0634621   .2583711     0.25   0.806    -.4492021    .5761264
                    |
  L.intent#L.defier#|
                 m1 |
             1 0 1  |  -.0027516   .1981444    -0.01   0.989    -.3959132    .3904099
                    |
                L2. |
             intent#|
       L2.defier#m2 |
             1 0 1  |  -.2087028   .2232519    -0.93   0.352     -.651683    .2342773
                    |
                L3. |
             intent#|
       L3.defier#m3 |
             1 0 1  |   .1684336   .2454237     0.69   0.494    -.3185403    .6554075
                    |
defier#m0#c.running |
               0 1  |   -.016822   .0066538    -2.53   0.013    -.0300245   -.0036195
                    |
        L.defier#m1#|
         cL.running |
               0 1  |   .0023001   .0047169     0.49   0.627    -.0070593    .0116596
                    |
       L2.defier#m2#|
        cL2.running |
               0 1  |   .0021448   .0048294     0.44   0.658    -.0074377    .0117274
                    |
       L3.defier#m3#|
        cL3.running |
               0 1  |   .0171726   .0048115     3.57   0.001     .0076255    .0267197
                    |
          defier#m0 |
               0 1  |  -.2761202   .2393532    -1.15   0.251    -.7510489    .1988084
                    |
        L.defier#m1 |
               0 1  |   -.453506    .149009    -3.04   0.003    -.7491721   -.1578398
                    |
       L2.defier#m2 |
               0 1  |  -.0499182   .1498133    -0.33   0.740    -.3471802    .2473439
                    |
       L3.defier#m3 |
               0 1  |   .8106551   .1642063     4.94   0.000     .4848342    1.136476
-------------------------------------------------------------------------------------

Absorbed degrees of freedom:
----------------------------------------------------------------------+
        Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
--------------------+-------------------------------------------------|
                cty |            0             100            100 *   |
----------------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation
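
To make the comparison easier to read, the estimates can be stored and tabulated side by side, with a third within estimator added as a cross-check. This is only a sketch under assumptions: rhs and cond are hypothetical local macros that repeat the specification above, -xtreg, fe- is brought in purely for comparison and requires -xtset cty year-, and the code is meant to be run from a do-file because of the /// continuations.

```stata
* rhs and cond are hypothetical locals holding the specification from the post.
local rhs 1.intent#0.defier#1.m0 l1.1.intent#l1.0.defier#1.m1           ///
          l2.1.intent#l2.0.defier#1.m2 l3.1.intent#l3.0.defier#1.m3     ///
          c.running#0.defier#1.m0 l1.c.running#l1.0.defier#1.m1         ///
          l2.c.running#l2.0.defier#1.m2 l3.c.running#l3.0.defier#1.m3   ///
          0.defier#m0 0.l1.defier#m1 0.l2.defier#m2 0.l3.defier#m3
local cond inrange(year,1996,2006) & insample

* Same model, three fixed-effects estimators.
areg ${outcome} `rhs' if `cond', cluster(cty) absorb(cty)
estimates store m_areg

reghdfe ${outcome} `rhs' if `cond', cluster(cty) absorb(cty)
estimates store m_reghdfe

* Assumption: -xtset cty year- has been run before this point.
xtreg ${outcome} `rhs' if `cond', fe vce(cluster cty)
estimates store m_xtreg

* Coefficients and standard errors side by side, with the sample size.
estimates table m_areg m_reghdfe m_xtreg, b(%9.4f) se(%9.4f) stats(N)
```

Both regressions above already report 1,100 observations, so the estimation samples are the same size. The table is meant to show which coefficients differ and whether -xtreg, fe- agrees with -areg- or with -reghdfe-.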

Any thoughts on why -areg- is diagnosing collinearity in the first regression, and why -reghdfe- doesn't return any collinearity warning?

Thank you.
