And my code is:
areg y treat1996-evergm1999 i.year, absorb(id)
I plot the coefficients against year; each coefficient is the difference in y between the treatment and control groups in that year.
Suppose my original data look like this, fully balanced:
id | year | y | treated or not | treat1996 | treat1997 | treat1998 | treat1999 |
1 | 1996 | 250 | 1 | 1 | 0 | 0 | 0 |
1 | 1997 | 132 | 1 | 0 | 1 | 0 | 0 |
1 | 1998 | 131 | 1 | 0 | 0 | 1 | 0 |
1 | 1999 | 175 | 1 | 0 | 0 | 0 | 1 |
2 | 1996 | 165 | 0 | 0 | 0 | 0 | 0 |
2 | 1997 | 242 | 0 | 0 | 0 | 0 | 0 |
2 | 1998 | 231 | 0 | 0 | 0 | 0 | 0 |
2 | 1999 | 213 | 0 | 0 | 0 | 0 | 0 |
3 | 1996 | 165 | 1 | 1 | 0 | 0 | 0 |
3 | 1997 | 242 | 1 | 0 | 1 | 0 | 0 |
3 | 1998 | 231 | 1 | 0 | 0 | 1 | 0 |
3 | 1999 | 213 | 1 | 0 | 0 | 0 | 1 |
4 | 1996 | 165 | 0 | 0 | 0 | 0 | 0 |
4 | 1997 | 242 | 0 | 0 | 0 | 0 | 0 |
4 | 1998 | 231 | 0 | 0 | 0 | 0 | 0 |
4 | 1999 | 213 | 0 | 0 | 0 | 0 | 0 |
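For reference, here is a minimal sketch of how the treat1996 to treat1999 columns above could be built from the treatment indicator. The variable name `treated` is my own placeholder for the "treated or not" column, and the data are just typed in from the table, so treat this as an illustration rather than the exact code I used.

```stata
* Sketch only: "treated" is a hypothetical name for the "treated or not" column
clear
input id year y treated
1 1996 250 1
1 1997 132 1
1 1998 131 1
1 1999 175 1
2 1996 165 0
2 1997 242 0
2 1998 231 0
2 1999 213 0
3 1996 165 1
3 1997 242 1
3 1998 231 1
3 1999 213 1
4 1996 165 0
4 1997 242 0
4 1998 231 0
4 1999 213 0
end

* One dummy per year: equals 1 only for treated ids in that year
forvalues t = 1996/1999 {
    generate byte treat`t' = treated * (year == `t')
}

* Print the data, with a separator line between ids
list, sepby(id)
```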
But then I deleted 2 observations (marked in red):
id | year | y | treated or not | treat1996 | treat1997 | treat1998 | treat1999 |
1 | 1996 | 250 | 1 | 1 | 0 | 0 | 0 |
1 | 1997 | 132 | 1 | 0 | 1 | 0 | 0 |
1 | 1998 | 131 | 1 | 0 | 0 | 1 | 0 |
1 | 1999 | 175 | 1 | 0 | 0 | 0 | 1 |
2 | 1996 | 165 | 0 | 0 | 0 | 0 | 0 |
2 | 1997 | 242 | 0 | 0 | 0 | 0 | 0 |
2 | 1998 | 231 | 0 | 0 | 0 | 0 | 0 |
2 | 1999 | 213 | 0 | 0 | 0 | 0 | 0 |
3 | 1996 | 165 | 1 | 1 | 0 | 0 | 0 |
3 | 1997 | 242 | 1 | 0 | 1 | 0 | 0 |
3 | 1998 | 231 | 1 | 0 | 0 | 1 | 0 |
3 | 1999 | 213 | 1 | 0 | 0 | 0 | 1 |
4 | 1996 | 165 | 0 | 0 | 0 | 0 | 0 |
4 | 1997 | 242 | 0 | 0 | 0 | 0 | 0 |
4 | 1998 | 231 | 0 | 0 | 0 | 0 | 0 |
4 | 1999 | 213 | 0 | 0 | 0 | 0 | 0 |
When I run the above code again, the coefficients change (which is not surprising),
but I don't know what Stata did in this regression.
I further checked and found that no observations were dropped from the second regression:
tab id if e(sample)
tab id if !e(sample)
. tab id if !e(sample)
no observations
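Note that e(sample) can only flag observations that are still in memory, so rows deleted before the regression cannot show up in this check. A rough sketch of a more direct check of whether the panel is balanced, assuming the data are in memory with id and year named as in the tables (`n_years` is a made-up helper variable):

```stata
* Sketch only: assumes the (possibly unbalanced) data are in memory
* with the variables id and year named as in the tables above.

* Number of yearly observations left for each id
bysort id: generate n_years = _N
tabulate n_years

* Declare the panel; xtset reports whether it is balanced
xtset id year

* Show the pattern of years present for each id
xtdescribe
```

If some ids have fewer than four years, the panel is unbalanced even though no observations are excluded from e(sample).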
So why are the coefficients different?