Hi everyone,

For my master's thesis, I am analyzing the impact of a country's legal system (i.e. common law versus civil law) on the accuracy of security analysts' earnings forecasts. My data cover 628 firms in 16 countries over 5 years (2014 to 2018). My model is as follows:

EPA_{i,t} = β0 + β1*LegalSyst_{i,t} + β2*LnSize_{i,t} + β3*Cover_{i,t} + β4*Loss_{i,t} + β5*Flev_{i,t} + β6*Roe_{i,t} + ε_{i,t}

where i indexes the firm and t the year, and LegalSyst and Loss are dummy variables.

I ran some diagnostic tests (output below) and it seems that a fixed-effects model is appropriate. The problem is that my variable of interest (LegalSyst) is omitted from the fixed-effects model, I suppose because it is time-invariant and therefore collinear with the firm fixed effects. As a result, I cannot examine the effect of the legal system on my dependent variable. I have seen some threads suggesting "hybrid models", but I don't know how to fit one because I have only basic knowledge of econometrics and of Stata/SE 16.0.
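
For reference, here is a minimal sketch of what I understand a hybrid (within-between) model to look like. It is an untested assumption on my part, not a verified solution: each time-varying regressor is split into its firm mean and the deviation from that mean, and both parts enter a random-effects regression together with LegalSyst. The m_ and d_ variable names are hypothetical, and the wildcards assume that no other variables in the dataset start with those prefixes.

Code:
* Sketch only: m_ and d_ are hypothetical variable names
foreach v of varlist LnSize Cover Loss Flev Roe {
    * firm-specific mean (between component)
    bysort EnterpriseID: egen m_`v' = mean(`v')
    * deviation from the firm mean (within component)
    generate d_`v' = `v' - m_`v'
}
* LegalSyst stays in the model because nothing is demeaned away
xtreg EPA LegalSyst d_* m_*, re vce(cluster EnterpriseID)

If I understand correctly, the coefficients on the d_ variables should reproduce the fixed-effects estimates, while the coefficient on LegalSyst still relies on the random-effects assumption that LegalSyst is uncorrelated with the firm effect.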

(1) Are there other ways to deal with the omitted variable so that I can obtain an estimated coefficient for LegalSyst?

I tried to run "xtset CountryID Year" but I got the message "repeated time values within panel data" because I have multiple firms for every Country and Year. Therefore, I went with the following code:

Code:
. xtset EnterpriseID Year
       panel variable:  EnterpriseID (strongly balanced)
        time variable:  Year, 2014 to 2018
                delta:  1 unit
(2) Is this panel variable appropriate for my analysis, given that I want to control for country effects in my model? If not, how can I do that?
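
One possibility I have come across, shown here only as a sketch under my own assumptions, is a multilevel model with random intercepts for countries and for firms nested within countries. Unlike country dummies, which would be collinear with LegalSyst because the legal system does not vary within a country, random country intercepts would leave LegalSyst in the model. I am not sure whether 16 countries are enough for this approach.

Code:
* Sketch only: random intercepts for country and for firm within country
mixed EPA LegalSyst LnSize Cover Loss Flev Roe || CountryID: || EnterpriseID: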

(3) Furthermore, if I want to jointly analyze, for example, 2 common law and 2 civil law countries from my sample, should I use "cluster"? If so, could you suggest the syntax? (Note: CountryID identifies the country and takes a value from 1 to 16.)
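
My guess at the syntax is sketched below. The country codes 1, 2, 5 and 9 are placeholders that I made up for illustration, not the actual codes of the countries I have in mind.

Code:
* Sketch only: 1, 2, 5, 9 are placeholder country codes
* Subsample of four countries, standard errors clustered by firm
xtreg EPA LegalSyst LnSize Cover Loss Flev Roe if inlist(CountryID, 1, 2, 5, 9), re vce(cluster EnterpriseID)
* Full sample, standard errors clustered by country (16 clusters)
xtreg EPA LegalSyst LnSize Cover Loss Flev Roe, re vce(cluster CountryID)

I assume that clustering by country on a four-country subsample would not be reliable, since there would be only four clusters, which is why the first command clusters by firm instead.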

Code:
. xtreg EPA LegalSyst LnSize Cover Loss Flev Roe, fe
note: LegalSyst omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      3,140
Group variable: EnterpriseID                    Number of groups  =        628

R-sq:                                           Obs per group:
     within  = 0.0704                                         min =          5
     between = 0.0447                                         avg =        5.0
     overall = 0.0331                                         max =          5

                                                F(5,2507)         =      37.99
corr(u_i, Xb)  = -0.7049                        Prob > F          =     0.0000

------------------------------------------------------------------------------
         EPA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   LegalSyst |          0  (omitted)
      LnSize |   -.021504    .006088    -3.53   0.000     -.033442    -.009566
       Cover |  -.0022359   .0005592    -4.00   0.000    -.0033324   -.0011394
        Loss |   .0692554   .0056121    12.34   0.000     .0582506    .0802602
        Flev |  -.0004474   .0008064    -0.55   0.579    -.0020287    .0011339
         Roe |  -.0012693   .0010576    -1.20   0.230    -.0033431    .0008044
       _cons |   .2085857   .0481035     4.34   0.000      .114259    .3029124
-------------+----------------------------------------------------------------
     sigma_u |  .07222027
     sigma_e |  .06471257
         rho |  .55466326   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(627, 2507) = 2.30                   Prob > F = 0.0000
Code:
estimates store fixed
Code:
. xtreg EPA LegalSyst LnSize Cover Loss Flev Roe, re

Random-effects GLS regression                   Number of obs     =      3,140
Group variable: EnterpriseID                    Number of groups  =        628

R-sq:                                           Obs per group:
     within  = 0.0601                                         min =          5
     between = 0.2990                                         avg =        5.0
     overall = 0.1551                                         max =          5

                                                Wald chi2(6)      =     411.26
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
         EPA |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   LegalSyst |   .0219064   .0046445     4.72   0.000     .0128033    .0310094
      LnSize |   .0045568   .0013606     3.35   0.001     .0018902    .0072235
       Cover |  -.0009433   .0002839    -3.32   0.001    -.0014997   -.0003869
        Loss |   .0848819   .0044507    19.07   0.000     .0761587    .0936052
        Flev |   .0006721   .0007244     0.93   0.353    -.0007476    .0020919
         Roe |  -.0008226   .0009642    -0.85   0.394    -.0027124    .0010673
       _cons |  -.0183203   .0088933    -2.06   0.039    -.0357508   -.0008897
-------------+----------------------------------------------------------------
     sigma_u |  .03070813
     sigma_e |  .06471257
         rho |  .18379327   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Code:
estimates store random
Code:
. hausman fixed random

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
      LnSize |    -.021504     .0045568       -.0260608         .005934
       Cover |   -.0022359    -.0009433       -.0012926        .0004818
        Loss |    .0692554     .0848819       -.0156266        .0034186
        Flev |   -.0004474     .0006721       -.0011196        .0003544
         Roe |   -.0012693    -.0008226       -.0004468        .0004344
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       58.99
                Prob>chi2 =      0.0000
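The Hausman test rejects the null hypothesis that the difference in coefficients is not systematic (chi2(5) = 58.99, Prob>chi2 = 0.0000), so I should use a fixed-effects model.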

Code:
. xtreg EPA LegalSyst LnSize Cover Loss Flev Roe i.Year,fe
note: LegalSyst omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      3,140
Group variable: EnterpriseID                    Number of groups  =        628

R-sq:                                           Obs per group:
     within  = 0.0736                                         min =          5
     between = 0.0412                                         avg =        5.0
     overall = 0.0310                                         max =          5

                                                F(9,2503)         =      22.10
corr(u_i, Xb)  = -0.7333                        Prob > F          =     0.0000

------------------------------------------------------------------------------
         EPA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   LegalSyst |          0  (omitted)
      LnSize |  -.0249718   .0068277    -3.66   0.000    -.0383603   -.0115833
       Cover |   -.001942   .0005912    -3.28   0.001    -.0031013   -.0007827
        Loss |   .0696952   .0056099    12.42   0.000     .0586946    .0806957
        Flev |  -.0004988    .000807    -0.62   0.537    -.0020813    .0010837
         Roe |  -.0012968    .001058    -1.23   0.220    -.0033715     .000778
             |
        Year |
       2015  |   .0057079   .0036651     1.56   0.120    -.0014791    .0128949
       2016  |   .0094027   .0036612     2.57   0.010     .0022234     .016582
       2017  |   .0093605    .003829     2.44   0.015     .0018522    .0168688
       2018  |   .0077649   .0039683     1.96   0.050    -.0000166    .0155465
             |
       _cons |   .2259927   .0527099     4.29   0.000     .1226332    .3293522
-------------+----------------------------------------------------------------
     sigma_u |  .07569337
     sigma_e |  .06465357
         rho |  .57817704   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(627, 2503) = 2.31                   Prob > F = 0.0000
Code:
. testparm i.Year

 ( 1)  2015.Year = 0
 ( 2)  2016.Year = 0
 ( 3)  2017.Year = 0
 ( 4)  2018.Year = 0

       F(  4,  2503) =    2.14
            Prob > F =    0.0729
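Prob > F = 0.0729 is above 0.05, so at the 5% level I cannot reject the hypothesis that the year coefficients are jointly zero. Time fixed effects therefore do not seem to be needed in this case.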

Code:
. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (628)  =   9.4e+08
Prob>chi2 =      0.0000
The modified Wald test rejects the null hypothesis of equal error variances across firms (Prob>chi2 = 0.0000), so heteroskedasticity is present.
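
As far as I understand, the usual response is to keep the same point estimates and request cluster-robust standard errors. A sketch of what I would run next, assuming that clustering on the panel variable is the right level:

Code:
* Sketch only: same fixed-effects model, standard errors clustered by firm
xtreg EPA LegalSyst LnSize Cover Loss Flev Roe, fe vce(cluster EnterpriseID)

LegalSyst would still be omitted here, because only the standard errors change.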


I would greatly appreciate any help. Thanks in advance.

Thanh