Dear Statalist,

I have panel data on 763 firms over 15 years, taken from an industry consortium. I want to estimate how four factors affect the firms' product certifications: changes in their memberships across competing industry consortia, the number of simultaneous affiliations, their role within the focal consortium, and the provision of a platform product (time-invariant). The basic model looks like this:

productcerts_t = beta0 + beta1 * changemem_{t-1} + beta2 * simulmem_{t-1} + beta3 * role_{t-1} + beta4 * platform + controls


While the model is rather straightforward, firms must be members in order to certify products. I therefore included a dummy variable member_t and its interactions with all other variables except role, since a role already requires member_t to be 1. However, in a more complete model with all control variables this causes multicollinearity, and the interactions produce a large set of results. The model then looks like this:

productcerts_t = beta0 + beta1 * changemem_{t-1} + beta2 * simulmem_{t-1} + beta3 * role_{t-1} + beta4 * platform + beta5 * member_t + beta6 * member_t * changemem_{t-1} + beta7 * member_t * simulmem_{t-1} + beta8 * member_t * platform + controls


I was wondering whether there is a more elegant way that still yields consistent estimates. Intuitively, I thought about filtering the observations: excluding all records where member_t == 0 and running a pooled OLS with time dummies and standard errors clustered on id. But I am not sure whether that is an appropriate approach.


Here are some results I computed:

1) pooled OLS with interactions and clustered standard errors
Code:
. reg productcerts i.member##c.L1.changemem i.member##c.L1.simulmem L1.role i.member##i.platform i.year, cluster(id)

Linear regression                               Number of obs     =     10,682
                                                F(21, 762)        =       5.34
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0789
                                                Root MSE          =     2.1651

                                            (Std. Err. adjusted for 763 clusters in id)
---------------------------------------------------------------------------------------
                      |               Robust
         productcerts |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
             1.member |   .4415453   .0704454     6.27   0.000     .3032552    .5798353
                      |
            changemem |
                  L1. |   .5532723   .5693309     0.97   0.331    -.5643709    1.670915
                      |
  member#cL.changemem |
                   1  |  -1.356654   .5850375    -2.32   0.021    -2.505131   -.2081776
                      |
             simulmem |
                  L1. |   .2238948   .1329231     1.68   0.093    -.0370441    .4848337
                      |
   member#cL.simulmem |
                   1  |   -.125708   .2387898    -0.53   0.599    -.5944721     .343056
                      |
                 role |
                  L1. |   3.496489   1.625868     2.15   0.032     .3047766    6.688201
                      |
           1.platform |   .1912549   .0916024     2.09   0.037      .011432    .3710779
                      |
      member#platform |
                 1 1  |    1.72839   .6695502     2.58   0.010     .4140079    3.042772
                      |
                 year |
                2007  |   .0248361    .064558     0.38   0.701    -.1018965    .1515687
                2008  |  -.0068504   .0425496    -0.16   0.872    -.0903788     .076678
                2009  |   .0131293   .0814032     0.16   0.872    -.1466718    .1729304
                2010  |  -.0718353   .0605614    -1.19   0.236    -.1907222    .0470516
                2011  |   .0194899   .0722105     0.27   0.787    -.1222652     .161245
                2012  |  -.0246552    .063305    -0.39   0.697    -.1489281    .0996177
                2013  |   .0549405   .0778698     0.71   0.481    -.0979243    .2078054
                2014  |  -.0239332    .068818    -0.35   0.728    -.1590286    .1111623
                2015  |   .1155241   .1268944     0.91   0.363      -.13358    .3646283
                2016  |   .1556162   .0833659     1.87   0.062     -.008038    .3192703
                2017  |   .2129104   .1003894     2.12   0.034     .0158378     .409983
                2018  |   .0882369   .0852473     1.04   0.301    -.0791104    .2555843
                2019  |   .2275257   .1756277     1.30   0.196    -.1172458    .5722973
                      |
                _cons |  -.0412679   .0527809    -0.78   0.435    -.1448811    .0623454
---------------------------------------------------------------------------------------

2) pooled OLS model with filtered observations, excluding records where member_t == 0
Code:
. reg productcerts c.L1.changemem c.L1.simulmem L1.role i.platform i.year if member, cluster(id)

Linear regression                               Number of obs     =      3,189
                                                F(17, 762)        =       4.06
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0612
                                                Root MSE          =     3.7748

                                    (Std. Err. adjusted for 763 clusters in id)
-------------------------------------------------------------------------------
              |               Robust
 productcerts |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
    changemem |
          L1. |  -.8704477    .463063    -1.88   0.061    -1.779478     .038583
              |
     simulmem |
          L1. |   .0518086   .2144131     0.24   0.809    -.3691018    .4727191
              |
         role |
          L1. |   3.817154   1.762396     2.17   0.031     .3574265    7.276881
              |
   1.platform |   1.894842   .6702957     2.83   0.005     .5789965    3.210687
              |
         year |
        2007  |   .0040985   .5051178     0.01   0.994    -.9874892    .9956862
        2008  |  -.1097982   .3155404    -0.35   0.728      -.72923    .5096335
        2009  |  -.1420047    .441294    -0.32   0.748    -1.008301    .7242916
        2010  |  -.3944345    .415614    -0.95   0.343    -1.210319      .42145
        2011  |    .057463   .4554769     0.13   0.900    -.8366756    .9516016
        2012  |  -.2025696   .4350108    -0.47   0.642    -1.056531    .6513922
        2013  |   .1460105   .4491477     0.33   0.745    -.7357033    1.027724
        2014  |  -.0861011   .4196607    -0.21   0.837    -.9099294    .7377273
        2015  |   .3009069   .4922737     0.61   0.541    -.6654667    1.267281
        2016  |   .3548497   .4243433     0.84   0.403    -.4781711     1.18787
        2017  |   .3556781   .4230937     0.84   0.401    -.4748896    1.186246
        2018  |   .1858268   .4198763     0.44   0.658    -.6384248    1.010078
        2019  |   .5825402   .5546162     1.05   0.294    -.5062169    1.671297
              |
        _cons |   .3286396   .4078965     0.81   0.421    -.4720947    1.129374
-------------------------------------------------------------------------------

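The second model yields coefficient estimates that differ only slightly from the member-specific effects implied by the first model. For example, the effect of L1.changemem for members is .553 - 1.357, or about -.80, in model 1 versus -.87 in model 2, and the platform effect for members is .191 + 1.728, or about 1.92, versus 1.89.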
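
To compare the two models directly, the member-specific slopes implied by model 1 can be obtained as linear combinations of the main effects and the interaction terms. The following is only a minimal sketch, assuming the data are xtset on id and year; the coefficient names passed to lincom are my reading of the output labels and should be checked against the coeflegend display first.

Code:
* Sketch only: assumes the panel is declared with id and year
xtset id year

* Model 1 as above, written with vce(cluster id)
regress productcerts i.member##c.L1.changemem i.member##c.L1.simulmem L1.role i.member##i.platform i.year, vce(cluster id)

* Replay the results with the exact _b[] names of each coefficient
regress, coeflegend

* Effect of each variable for members = main effect + interaction
* (names assumed; verify them in the coeflegend output)
lincom L.changemem + 1.member#cL.changemem
lincom L.simulmem + 1.member#cL.simulmem
lincom 1.platform + 1.member#1.platform

Each lincom line reports the combined coefficient with a cluster-robust standard error, which can be set against the corresponding estimate from the members-only regression in 2).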




3) random-effects model (xtreg default) with interactions, followed by the Breusch-Pagan LM test
Code:
. quietly: xtreg productcerts i.member##c.L1.changemem i.member##c.L1.simulmem L1.role i.member##i.platform i.year
. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects

        productcerts[id,t] = Xb + u[id] + e[id,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               product~s |   5.079473       2.253769
                       e |     4.0121       2.003023
                       u |   .6070492       .7791336

        Test:   Var(u) = 0
                             chibar2(01) =  1353.13
                          Prob > chibar2 =   0.0000

Further, the Breusch-Pagan LM test rejects Var(u) = 0 and thus favors a random-effects model over pooled OLS. The Hausman test and suest cannot be run on these data/models.
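
Since the Hausman test is not available here, one option that is sometimes used in its place is a Mundlak-type (correlated random effects) check: add the firm-level means of the time-varying regressors to the RE model and test them jointly with cluster-robust standard errors. The following is only a sketch under several assumptions. The variables L1changemem, L1simulmem, L1role and mean_* are hypothetical helper names, the interactions are left out for brevity, and the firm means are taken over all available observations rather than strictly over the estimation sample.

Code:
* Sketch only: assumes the panel is declared with id and year
xtset id year

* Hypothetical helper variables holding the one-period lags
generate double L1changemem = L1.changemem
generate double L1simulmem  = L1.simulmem
generate double L1role      = L1.role

* Firm-level means of the time-varying regressors
foreach v of varlist member L1changemem L1simulmem L1role {
    bysort id: egen double mean_`v' = mean(`v')
}

* RE model augmented with the firm means, standard errors clustered on id
xtreg productcerts i.member L1changemem L1simulmem L1role i.platform i.year mean_*, re vce(cluster id)

* Joint test of the firm means; rejection would speak against plain RE
testparm mean_*

If the means are jointly significant, that would point toward FE (or keeping the means in the model) rather than plain RE. In an FE model the time-invariant platform main effect drops out, while its interaction with member remains identified as long as member varies within firms.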



I would appreciate any suggestions on how to handle this problem. Would you recommend sticking with an RE/FE model and using the interactions? Is it legitimate, under some assumptions, to filter observations for pooled OLS? Or is there another approach?


Best,
Sven