Dear experts,

I have a panel dataset with 77 variables and approximately 57,000 observations for the years 2014 to 2018. I use dummy variables for the independent variables company size (klein, mittel, groß, i.e. small, medium, large) and industry sector (LuF, BB, etc.). With these, I ran a regression to determine their effect on the tax burden (ETR_un) of companies.

I am using xtreg in Stata 15.1.

My problem is that as soon as I add the company size dummies to the regression in addition to the industry dummies, two variables (sonst_DL and groß) are omitted. As a result, the coefficients of the remaining dummies look distorted to me.

I know that, to avoid the dummy variable trap, I can drop one dummy from the industry set and one from the company size set, but the coefficients still look distorted.
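
To show what I mean by dropping one category per set, here is a minimal sketch that uses factor-variable notation instead of my hand-made dummies. It assumes the data are already declared as a panel, as for the commands in this post. It also assumes that Branche and Größe_HP hold the same categories as the dummies. The names branche_id and groesse_id are hypothetical helper variables, not part of my dataset.

```stata
* Sketch only: branche_id and groesse_id are hypothetical helper variables
* built from the categorical variables Branche and Größe_HP.
* egen group() accepts both string and numeric variables.
egen branche_id = group(Branche), label
egen groesse_id = group(Größe_HP), label

* With i. notation Stata leaves out one base level per set automatically
* (by default the lowest level), so no dummy has to be dropped by hand.
xtreg ETR_un i.branche_id i.groesse_id i.year, re
```

Which category ends up as the base depends on how the levels are ordered, so the omitted levels here need not be sonst_DL and groß.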


How can I get around this problem?


Code:
xtreg ETR_un LuF BB Verarbeitendes Energieversorg Wasserversorg Baugewerbe Handel Verkehr Gastgewerbe ///
    Inform_Kommun Finanz_Versich Grundstücks_Wohnungswesen FreiWissTech_DL wirts_DL ÖV Erziehung_Unterr ///
    Gesundheit_Sozialwesen Kunst_Unterhaltung_Erholung sonst_DL klein mittel groß i.year, re
note: sonst_DL omitted because of collinearity
note: groß omitted because of collinearity

Random-effects GLS regression                   Number of obs     =     57,217
Group variable: ID                              Number of groups  =     18,389

R-sq:                                           Obs per group:
     within  = 0.0013                                         min =          1
     between = 0.0589                                         avg =        3.1
     overall = 0.0337                                         max =          5

                                                Wald chi2(24)     =    1163.20
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

---------------------------------------------------------------------------------------------
                     ETR_un |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------------------+----------------------------------------------------------------
                        LuF |  -2.806153   1.724013    -1.63   0.104    -6.185157    .5728511
                         BB |   1.223227   1.931906     0.63   0.527    -2.563238    5.009692
             Verarbeitendes |   1.156536   .7854209     1.47   0.141    -.3828609    2.695932
             Energieversorg |  -1.362574   .8817548    -1.55   0.122    -3.090782    .3656336
              Wasserversorg |   1.335969   1.015181     1.32   0.188    -.6537506    3.325688
                 Baugewerbe |   .8637391   .8727683     0.99   0.322    -.8468553    2.574333
                     Handel |     2.5564    .789308     3.24   0.001     1.009385    4.103415
                    Verkehr |     1.4275   .8937224     1.60   0.110    -.3241637    3.179164
                Gastgewerbe |   2.483374   1.282375     1.94   0.053    -.0300348    4.996784
              Inform_Kommun |    2.36475   .8876878     2.66   0.008     .6249134    4.104586
             Finanz_Versich |   3.360829    .911962     3.69   0.000     1.573416    5.148241
  Grundstücks_Wohnungswesen |  -6.140547   .9226302    -6.66   0.000    -7.948869   -4.332225
            FreiWissTech_DL |   1.915703     .80875     2.37   0.018      .330582    3.500824
                   wirts_DL |    2.66731    .880347     3.03   0.002     .9418616    4.392758
                         ÖV |   6.128692   2.154682     2.84   0.004     1.905592    10.35179
           Erziehung_Unterr |  -7.485594   1.566971    -4.78   0.000     -10.5568   -4.414388
     Gesundheit_Sozialwesen |  -11.52747   .8984018   -12.83   0.000    -13.28831   -9.766636
Kunst_Unterhaltung_Erholung |   1.190228    1.34611     0.88   0.377      -1.4481    3.828556
                   sonst_DL |          0  (omitted)
                      klein |    .107322   .3644027     0.29   0.768    -.6068941    .8215381
                     mittel |  -.4489496   .2944976    -1.52   0.127    -1.026154    .1282552
                       groß |          0  (omitted)
                            |
                       year |
                        15  |   .3697264   .1749918     2.11   0.035     .0267489     .712704
                        16  |  -.2524269   .1744287    -1.45   0.148    -.5943008    .0894471
                        17  |   .3338847   .1742833     1.92   0.055    -.0077044    .6754738
                        18  |   .9312448   .2926599     3.18   0.001     .3576419    1.504848
                            |
                      _cons |   26.40237    .775345    34.05   0.000     24.88272    27.92202
----------------------------+----------------------------------------------------------------
                    sigma_u |    9.70098
                    sigma_e |   12.63551
                        rho |  .37085086   (fraction of variance due to u_i)

Below you can see that the average tax rates by industry and by company size do not match the coefficients in the regression output.

Code:
tabstat ETR_un, statistics(count mean sd max min range) by(Branche)

Summary for variables: ETR_un
     by categories of: Branche (Branche)

         Branche |         N      mean        sd       max       min     range
-----------------+------------------------------------------------------------
1. Land- und For |       177  24.83884  12.81619  76.21348  1.072381   75.1411
2. Bergbau und G |       142  28.01898  17.38571  91.96083  1.116526   90.8443
3. Verarbeitende |     16119  28.02748  14.02538   99.6544  1.005321  98.64908
4. Energieversor |      2514  25.49997  15.52516  97.77159  1.019462  96.75213
5. Wasserversorg |      1067   27.8953  15.64467  99.73144  1.119681  98.61176
6. Baugewerbe/Ba |      2725  27.80223  12.44849  98.92137  1.014662  97.90671
7. Handel; Insta |     13455  29.22813  13.67265  99.76919  1.003844  98.76534
8. Verkehr und L |      2173  28.26321  14.86916  99.54535  1.024184  98.52117
9. Gastgewerbe/B |       417  29.41986  15.65624  99.04601  1.067991  97.97802
10. Information  |      2270  29.26193   15.2746  97.74427  1.017193  96.72708
11. Erbringung v |      1842  30.01445  18.04395  99.85857  1.026219  98.83235
12. Grundstücks- |      1679  20.97251  16.91162  99.36201  1.012189  98.34982
13. Erbringung v |      6944  28.79622  17.03924  99.88694  1.017734  98.86921
14. Erbringung v |      2441   29.4939    15.703  99.85537   1.02731  98.82806
15. Öffentliche  |       108  33.55072  27.06546  98.77544  1.449751  97.32569
16. Erziehung un |       206  20.88952  22.37715  99.39492  1.019612  98.37531
17. Gesundheits- |      1822  15.34987  16.67667    99.662  1.000133  98.66187
18. Kunst, Unter |       366  27.51446  19.84288  97.59387   1.18329  96.41058
19. Erbringung v |       750  27.34199  17.72626  98.51981  1.002463  97.51734
-----------------+------------------------------------------------------------
           Total |     57217  27.82532  15.28509  99.88694  1.000133  98.88681
------------------------------------------------------------------------------

Code:
tabstat ETR_un, statistics(count mean sd max min range) by(Größe_HP)

Summary for variables: ETR_un
     by categories of: Größe_HP 

     Größe_HP |         N      mean        sd       max       min     range
--------------+------------------------------------------------------------
   große KapG |     45881  27.75923   15.2651  99.88694  1.001677  98.88526
  kleine KapG |      3250  28.64398   15.4276  99.27302  1.003844  98.26917
mittlere KapG |      8086  27.87132  15.33302  99.73144  1.000133   98.7313
--------------+------------------------------------------------------------
        Total |     57217  27.82532  15.28509  99.88694  1.000133  98.88681
---------------------------------------------------------------------------

Lastly, I wanted to ask whether the random-effects model (REM) is the right choice here. In the fixed-effects model (FEM), all industry dummies are reported as "omitted":

Code:
xtreg ETR_un LuF BB Verarbeitendes Energieversorg Wasserversorg Baugewerbe Handel Verkehr Gastgewerbe ///
    Inform_Kommun Finanz_Versich Grundstücks_Wohnungswesen FreiWissTech_DL wirts_DL ÖV Erziehung_Unterr ///
    Gesundheit_Sozialwesen Kunst_Unterhaltung_Erholung sonst_DL klein mittel, fe
note: LuF omitted because of collinearity
note: BB omitted because of collinearity
note: Verarbeitendes omitted because of collinearity
note: Energieversorg omitted because of collinearity
note: Wasserversorg omitted because of collinearity
note: Baugewerbe omitted because of collinearity
note: Handel omitted because of collinearity
note: Verkehr omitted because of collinearity
note: Gastgewerbe omitted because of collinearity
note: Inform_Kommun omitted because of collinearity
note: Finanz_Versich omitted because of collinearity
note: Grundstücks_Wohnungswesen omitted because of collinearity
note: FreiWissTech_DL omitted because of collinearity
note: wirts_DL omitted because of collinearity
note: ÖV omitted because of collinearity
note: Erziehung_Unterr omitted because of collinearity
note: Gesundheit_Sozialwesen omitted because of collinearity
note: Kunst_Unterhaltung_Erholung omitted because of collinearity
note: sonst_DL omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =     57,217
Group variable: ID                              Number of groups  =     18,389

R-sq:                                           Obs per group:
     within  = 0.0007                                         min =          1
     between = 0.0002                                         avg =        3.1
     overall = 0.0001                                         max =          5

                                                F(2,38826)        =      13.26
corr(u_i, Xb)  = -0.0121                        Prob > F          =     0.0000

---------------------------------------------------------------------------------------------
                     ETR_un |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------------+----------------------------------------------------------------
                        LuF |          0  (omitted)
                         BB |          0  (omitted)
             Verarbeitendes |          0  (omitted)
             Energieversorg |          0  (omitted)
              Wasserversorg |          0  (omitted)
                 Baugewerbe |          0  (omitted)
                     Handel |          0  (omitted)
                    Verkehr |          0  (omitted)
                Gastgewerbe |          0  (omitted)
              Inform_Kommun |          0  (omitted)
             Finanz_Versich |          0  (omitted)
  Grundstücks_Wohnungswesen |          0  (omitted)
            FreiWissTech_DL |          0  (omitted)
                   wirts_DL |          0  (omitted)
                         ÖV |          0  (omitted)
           Erziehung_Unterr |          0  (omitted)
     Gesundheit_Sozialwesen |          0  (omitted)
Kunst_Unterhaltung_Erholung |          0  (omitted)
                   sonst_DL |          0  (omitted)
                      klein |   1.206493   .2799578     4.31   0.000     .6577689    1.755217
                     mittel |   .5408651   .1822532     2.97   0.003     .1836443     .898086
                      _cons |   27.68036   .0611299   452.81   0.000     27.56054    27.80017
----------------------------+----------------------------------------------------------------
                    sigma_u |  13.079854
                    sigma_e |  12.638967
                        rho |  .51713756   (fraction of variance due to u_i)
---------------------------------------------------------------------------------------------
F test that all u_i=0: F(18388, 38826) = 2.44                Prob > F = 0.0000

Many thanks.

Kind regards
Can