Using two panel datasets simultaneously: time dummies and cohort dummy

Hi, I have a question about using two panel datasets simultaneously.

My data consist of two cohorts (a 2005 cohort and a 2015 cohort).
The first cohort runs from 2005 to 2007.
The second cohort runs from 2015 to 2017.
I appended the two panel datasets; an excerpt of the result is shown below.

Code:

. list pid year peducost male cohort if 6904 <= pid & pid <= 10005, sep(15)

       +-----------------------------------------+
       |   pid   year   peducost   male   cohort |
       |-----------------------------------------|
20710. |  6904   2005          0      1     2005 |
20711. |  6904   2006          0      1     2005 |
20712. |  6904   2007          0      1     2005 |
20713. |  6905   2005         30      1     2005 |
20714. |  6905   2006         50      1     2005 |
20715. |  6905   2007         58      1     2005 |
20716. |  6906   2005         12      1     2005 |
20717. |  6906   2006         27      1     2005 |
20718. |  6906   2007         22      1     2005 |
20719. |  6907   2005         18      1     2005 |
20720. |  6907   2006         27      1     2005 |
20721. |  6907   2007         18      1     2005 |
20722. |  6908   2005          0      1     2005 |
20723. |  6908   2006         75      1     2005 |
20724. |  6908   2007         26      1     2005 |
       |-----------------------------------------|
20725. | 10001   2015          0      0     2015 |
20726. | 10001   2016          0      0     2015 |
20727. | 10001   2017          0      0     2015 |
20728. | 10002   2015          9      0     2015 |
20729. | 10002   2016          0      0     2015 |
20730. | 10002   2017          0      0     2015 |
20731. | 10003   2015          0      0     2015 |
20732. | 10003   2016          0      0     2015 |
20733. | 10003   2017         34      0     2015 |
20734. | 10004   2015          0      1     2015 |
20735. | 10004   2016          0      1     2015 |
20736. | 10004   2017          0      1     2015 |
20737. | 10005   2015          0      0     2015 |
20738. | 10005   2016          0      0     2015 |
20739. | 10005   2017          0      0     2015 |
       +-----------------------------------------+

Here pid is the personal id (greater than 10000 if the person belongs to the 2015 cohort), peducost is the private education cost, and male is a dummy variable equal to one if the person is male.
That is, I am using two panel datasets simultaneously (the 2005 cohort set and the 2015 cohort set).

I want to know whether the partial effect of gender on private education cost differs between the two cohorts.
So I ran a regression with an interaction term, as shown below.

Code:

. xtset pid year
       panel variable:  pid (unbalanced)
        time variable:  year, 2005 to 2017, but with gaps
                delta:  1 unit

. global ctrlvar "dadage dadagesq momage momagesq i.dadedu i.momedu"

. 
. gen dummy_2015 = (cohort == 2015)

. xtreg peducost 1.male#1.dummy_2015 male $ctrlvar i.urbrur b2005.year i.dummy_2015, re vce(cl pid)
note: 1.dummy_2015 omitted because of collinearity

Random-effects GLS regression                   Number of obs     =     31,735
Group variable: pid                             Number of groups  =      6,836

R-sq:                                           Obs per group:
     within  = 0.1524                                         min =          1
     between = 0.2032                                         avg =        4.6
     overall = 0.1707                                         max =          6

                                                Wald chi2(18)     =    3477.54
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                   (Std. Err. adjusted for 6,836 clusters in pid)
---------------------------------------------------------------------------------
                |               Robust
       peducost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
male#dummy_2015 |
           1 1  |   -.720143    .891734    -0.81   0.419     -2.46791    1.027624
                |
           male |   1.362174   .5742116     2.37   0.018     .2367403    2.487609
         dadage |   1.108523   .4977059     2.23   0.026     .1330374    2.084009
       dadagesq |  -.0124439   .0051745    -2.40   0.016    -.0225856   -.0023022
         momage |   1.653206   .4077752     4.05   0.000     .8539813    2.452431
       momagesq |  -.0164883   .0043175    -3.82   0.000    -.0249505   -.0080262
                |
         dadedu |
   high_school  |   2.382997   .8907487     2.68   0.007     .6371611    4.128832
    university  |   10.57592   .9752616    10.84   0.000     8.664444     12.4874
                |
         momedu |
   high_school  |   3.838404   .8931592     4.30   0.000     2.087844    5.588964
    university  |   12.82762   1.061476    12.08   0.000     10.74717    14.90807
                |
         urbrur |
      big_city  |  -8.737641   .8099345   -10.79   0.000    -10.32508   -7.150199
          city  |    -9.8842   .7329264   -13.49   0.000    -11.32071   -8.447691
         rural  |  -15.79478   .8363283   -18.89   0.000    -17.43395   -14.15561
                |
           year |
          2006  |   3.182023   .2959997    10.75   0.000     2.601874    3.762171
          2007  |   11.30476    .491985    22.98   0.000     10.34049    12.26903
          2015  |   11.79945   .6385644    18.48   0.000     10.54789    13.05101
          2016  |   13.24131   .6740927    19.64   0.000     11.92011    14.56251
          2017  |   15.41846   .7222268    21.35   0.000     14.00292      16.834
                |
   1.dummy_2015 |          0  (omitted)
          _cons |  -54.29597   10.39154    -5.23   0.000    -74.66301   -33.92893
----------------+----------------------------------------------------------------
        sigma_u |  12.830406
        sigma_e |  23.443425
            rho |  .23049036   (fraction of variance due to u_i)
---------------------------------------------------------------------------------

Here ctrlvar holds the control variables and urbrur is the urban/rural area variable.

The problem is that the dummy_2015 variable (equal to one if a person is in the 2015 cohort) is omitted.
I think dummy_2015 and the time dummies cannot be used together because of perfect multicollinearity.
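To make the suspected collinearity concrete: if each cohort is observed only in its own three years, then dummy_2015 is exactly the sum of the 2015, 2016 and 2017 year dummies. A minimal check, as a sketch under that assumption (late_years is a hypothetical helper variable, not part of the code above):

```stata
* Hypothetical helper: 1 when the observation falls in 2015-2017, 0 otherwise
generate byte late_years = inlist(year, 2015, 2016, 2017)

* Passes silently if the cohort dummy equals the sum of the 2015-2017 year dummies;
* stops with an error if any observation breaks that pattern
assert dummy_2015 == late_years
```

If the assert fails, the omission would have to come from some other source of collinearity.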

One solution would be to use only cross-sectional data (for example, combining the 2005 and 2015 waves).
However, for my own reasons, I want to use the two panel datasets simultaneously.

In this case, how can I test whether the partial effect of gender differs between the two cohorts?
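One possibility, offered only as a sketch and not as a confirmed answer: since the year dummies already absorb the cohort main effect, the coefficient on the male-by-cohort interaction may itself be the cross-cohort difference in the male effect, even with 1.dummy_2015 omitted. Assuming the same variables and the $ctrlvar global defined above, that could be examined like this:

```stata
* Sketch only: same model written with full factor-variable notation.
* 1.dummy_2015 is still expected to be dropped (collinear with the year dummies),
* but the interaction term remains in the model.
xtreg peducost i.male##i.dummy_2015 $ctrlvar i.urbrur b2005.year, re vce(cluster pid)

* Wald test of the null that the male effect is the same in both cohorts
test 1.male#1.dummy_2015

* Male effect within the 2015 cohort (main effect plus interaction)
lincom 1.male + 1.male#1.dummy_2015
```

Whether this interpretation is appropriate for my data is exactly what I would like to confirm.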

Thank you for taking the time to read this question.

BJ Data Tech Solution
