I am exploring what happens when two regressors are collinear, and at least one is simultaneous with the dependent variable (which might occur, for example, when regressing unemployment and its square on crime, where crime causes unemployment and unemployment causes crime, possibly through a missing third variable). In real life such a regression makes no sense, but that will not stop people from doing it. And Stata gives a result, a strange one..
I model the general situation this way on Stata 15:
set obs 100000
g a =rnormal(10)
g b =rnormal(10
g d=rnormal(10)
g y=a+b
g x=a+d
g z=x+b*.01
reg y x z;
The result makes no sense:
Source | SS df MS Number of obs = 100,000
-------------+---------------------------------- F(2, 99997) > 99999.00
Model | 149553.241 2 74776.6205 Prob > F = 0.0000
Residual | 50081.9422 99,997 .500834447 R-squared = 0.7491
-------------+---------------------------------- Adj R-squared = 0.7491
Total | 199635.183 99,999 1.9963718 Root MSE = .7077
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | -99.33594 .2235979 -444.26 0.000 -99.77419 -98.8977
z | 99.8365 .2235935 446.51 0.000 99.39826 100.2747
_cons | .0064508 .0389119 0.17 0.868 -.069816 .0827176
------------------------------------------------------------------------------
The correlation between x and z is .9999.
The coefficients get higher with more collinearity, and lower with less.
I did the same regression on my 20 year old SAS. The results were completely different, with coefficients of one.
What causes the expanding coefficients in Stata?
Related Posts with Odd results when combining near collinearity with simultaneity
Fundamental Question on Panel regression Pooled OLS (Driscoll-Kraay Est.) and Fixed EffectHello, I have several questions regarding the fundamental approach of my regression model. I have a…
shapley returns "variable __weigh not found"Dear all, I wanted to replicate the example in the help file of the shapley.ado. But it returns "va…
High coefficients values of the interaction termsHi experts and researchers, I am working on a Panel data model using system GMM, xtabond2 command. …
margins ar.variable VS contrast ar.variable, effectsAfter running xtreg on a model with a continuous dependent variable, I want to test the differences …
How do I calculate the number of workers in the household?Hi, I am working on panel data, and I want to know how many workers are in the household. An exampl…
Subscribe to:
Post Comments (Atom)
0 Response to Odd results when combining near collinearity with simultaneity
Post a Comment