I am relatively new to Stata, so please bear with me. Also, I am aware that this issue has already been addressed on this forum, but I don't seem to be able to find the solution to my problem.
I am using logit in Stata 15.1 to understand whether migration changes employment outcomes. I am using an unbalanced dataset. For the purpose of this explanation, I will use the most basic specification (i.e., without any socio-economic control variables and without margins, which I use at a later stage).
I am typing:
Code:
logit employed i.migrant##i.migration i.year, cluster(ident)
- employed is a binary variable equal to 1 for years when respondents were economically active, 0 otherwise.
- migrant is the 'treatment' indicator: a time-invariant binary variable that equals 1 for migrants (those who migrated) and 0 for non-migrants (those who stayed behind).
- migration is 'time' or 'post': a binary variable equal to 1 for years after migration, 0 for years before migration. As such, migration == 0 for both groups in the years before migration, and migration == 1 only for the one group that underwent the treatment, i.e. migrants.
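A minimal sketch of how this coding can be checked, assuming the variables are named exactly as described above (migrant, migration) and that the data are already in memory. It is only a diagnostic aid, not part of the model:

```stata
* Sketch only: assumes the variable names migrant and migration as described
* above, with the dataset already loaded.

* Cross-tabulate treatment group by post-migration indicator.
* A zero in the migrant == 0, migration == 1 cell means no control-group
* observation is ever coded as "post".
tabulate migrant migration, missing

* Count control-group observations coded as post-migration.
count if migrant == 0 & migration == 1

* Stops with an error if the product migrant*migration ever differs from
* migration; if it passes, the two are identical in every observation.
assert migrant * migration == migration if !missing(migrant, migration)
```

If the count is zero and the assert passes, the product of the two indicators carries no information beyond migration itself, which is consistent with the "identifies no observations" and "omitted because of collinearity" notes in the output.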
The problem I encounter is as follows: the interaction term is omitted due to collinearity (while both migrant & migration are estimated without problems). More specifically, I obtain the following output:
Code:
logit employed i.migrant##i.l_mig2 i.year, cluster(ident)
note: 1950.year != 0 predicts success perfectly
1950.year dropped and 1 obs not used
note: 0.migrant#1.l_mig2 identifies no observations in the sample
note: 1.migrant#1.l_mig2 omitted because of collinearity
note: 2009.year omitted because of collinearity
Iteration 0: log pseudolikelihood = -67798.349
Iteration 1: log pseudolikelihood = -67286.391
Iteration 2: log pseudolikelihood = -67285.374
Iteration 3: log pseudolikelihood = -67285.374
Logistic regression                             Number of obs     =    104,797
                                                Wald chi2(60)     =     224.75
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -67285.374               Pseudo R2         =     0.0076
                                (Std. Err. adjusted for 4,502 clusters in ident)
--------------------------------------------------------------------------------
               |               Robust
      employed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
     1.migrant |   -.259598   .0554714    -4.68   0.000      -.36832    -.150876
      1.l_mig2 |   .4030797   .0643309     6.27   0.000     .2769934     .529166
               |
migrant#l_mig2 |
          0 1  |          0  (empty)
          1 1  |          0  (omitted)
               |
          year |
         1950  |          0  (empty)
         1951  |  -.8324219   1.000956    -0.83   0.406     -2.79426    1.129416
         1952  |  -.9865726   .5585038    -1.77   0.077     -2.08122    .1080748
         1953  |  -.8025823   .3977641    -2.02   0.044    -1.582186   -.0229789
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 ident double year float(migrant migration)
"B0000001" 1991 1 0
"B0000001" 1992 1 0
"B0000001" 1993 1 0
"B0000001" 1994 1 0
"B0000001" 1995 1 0
"B0000001" 1996 1 0
"B0000001" 1997 1 0
"B0000001" 1998 1 0
"B0000001" 1999 1 0
"B0000001" 2000 1 0
"B0000001" 2001 1 0
"B0000001" 2002 1 0
"B0000001" 2003 1 1
"B0000001" 2004 1 1
"B0000001" 2005 1 1
"B0000001" 2006 1 1
"B0000001" 2007 1 1
"B0000001" 2008 1 1
"B0000001" 2009 1 1
end
format %ty year
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 ident double year float(migrant migration)
"C0001002" 1995 0 0
"C0001002" 1996 0 0
"C0001002" 1997 0 0
"C0001002" 1998 0 0
"C0001002" 1999 0 0
"C0001002" 2000 0 0
"C0001002" 2001 0 0
"C0001002" 2002 0 0
"C0001002" 2003 0 0
"C0001002" 2004 0 0
"C0001002" 2005 0 0
"C0001002" 2006 0 0
"C0001002" 2007 0 0
"C0001002" 2008 0 0
"C0001002" 2009 0 0
end
format %ty year
Best wishes,
Justyna