I would be grateful if anyone here could help me with a panel data regression in Stata. I am running a fixed-effects (FE) panel regression on longitudinal survey data with two waves (wave 1 in 2011 and wave 2 in 2013). In theory, if every respondent appeared in both waves, the number of observations in the regression should be even, because I have only two time periods. However, the number of observations used in the FE regression is odd (n = 10625, see the screenshot below).
[Screenshot: FE regression output, number of observations = 10625]
To track down the odd number, I checked for duplicates across the two waves. It turns out that only about half of my sample is observed in both waves (n = 5176, see the screenshot below). In the screenshot, 1 marks observations that exist in both waves, and 0 marks observations that exist in only one wave.
[Screenshot: tabulation of the duplicates indicator, 1 = in both waves, 0 = in one wave only]
I am confused about why Stata can still run the FE regression when a whole wave is missing for so many respondents. After dropping the observations that exist in only one wave, I ran the FE regression again and got exactly the same results, except for the number of observations and groups (see the screenshot below, n = 5176).
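[Screenshot: FE regression output after dropping single-wave observations, number of observations = 5176]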
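A plausible reading of the identical results is that the FE (within) estimator subtracts each respondent's own mean from every variable, so a respondent observed in only one wave is demeaned to exactly zero and adds nothing to the coefficient, even though it is still counted in the number of observations. The sketch below illustrates that arithmetic in plain Python rather than Stata. The data are made up, and `within_slope` is a hypothetical helper for a single regressor, not a Stata command.

```python
from collections import defaultdict


def within_slope(rows):
    """Fixed-effects (within) slope for one regressor.

    rows: iterable of (unit_id, x, y) tuples. Hypothetical helper for
    illustration only; it is not how Stata implements its estimator.
    """
    groups = defaultdict(list)
    for unit, x, y in rows:
        groups[unit].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        # Demean x and y within each unit (the within transformation).
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # For a unit with a single observation, both deviations are zero.
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


# Made-up panel: units 1 and 2 appear in both waves, units 3 and 4 in one.
panel = [
    (1, 1.0, 2.0), (1, 3.0, 5.0),
    (2, 2.0, 1.0), (2, 5.0, 8.0),
    (3, 4.0, 9.0),  # single-wave unit
    (4, 7.0, 3.0),  # single-wave unit
]
both_waves = [row for row in panel if row[0] in (1, 2)]

slope_all = within_slope(panel)            # single-wave units included
slope_balanced = within_slope(both_waves)  # single-wave units dropped
print(slope_all, slope_balanced)
```

In this toy example the two slopes coincide, which matches what I see in my own output: only the reported counts of observations and groups change.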
I am quite confused about how Stata handles missing data here. Should I drop the single-wave observations and keep only the observations that exist in both waves? I am worried that I would lose a large number of observations and introduce bias into my regression. But if I keep the single-wave observations, a longitudinal analysis seems meaningless. Does anyone know why this happens, or how to deal with it? Many thanks.