Dear all,

I have a question about instrumental variable regression.

My sample is identified at the firm-year level (the combination of the variables firm_id and year uniquely identifies my observations). My dependent variable (Y) also varies at the firm-year level. I am interested in the effect of a variable (X) that varies only by country. Each firm belongs to exactly one country, so the country does not vary within firm_id.
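For concreteness, the structure described above could be checked roughly as follows. This is only a sketch: the variable name country is a placeholder for my country identifier, and each assert simply stops with an error if the stated condition does not hold.

Code:
```stata
* firm_id and year should jointly identify the observations
isid firm_id year

* X should be constant within country (country is a placeholder name)
bysort country (X): assert X[1] == X[_N]

* each firm should belong to exactly one country
bysort firm_id (country): assert country[1] == country[_N]
```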

In an attempt to establish a causal link, I used the Stata command ivreg2 and included a new variable (Z) as an instrument for X. Like X, Z varies only at the country level.

My commands look like this:

Code:
ivreg2 Y (X=Z), savefirst
This uses approximately 13,000 observations in both the first and the second stage. The first-stage and second-stage coefficients seem reasonable.

I have now received a comment that this is not the correct way to set up the instrumental variable estimation. According to the comment, I should run the first-stage regression at the country level only, because X and Z vary only at that level, and this would yield the correct coefficient to carry into the second stage. I would then have a first stage with approx. 30 observations and a second stage with approx. 13,000 observations. I did this "by hand": I ran the first-stage regression at the country level, merged the predicted values of X into my firm-year sample, and then included the predicted values in a simple OLS regression (reg). I am fairly sure this is not correct, because the OLS standard errors do not account for the fact that the regressor is itself a prediction from a first stage.
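To make the "by hand" procedure concrete, this is roughly what it looks like. It is a sketch rather than my exact do-file: the names country and X_hat are placeholders, and it assumes X and Z are constant within country so that collapsing by country leaves one row per country.

Code:
```stata
* Step 1: first stage on one observation per country (approx. 30 rows)
preserve
    collapse (mean) X Z, by(country)   // country is a placeholder name
    regress X Z
    predict double X_hat, xb           // fitted values of X
    keep country X_hat
    tempfile firststage
    save `firststage'
restore

* Step 2: attach the country-level fitted values to the firm-year sample
merge m:1 country using `firststage', nogenerate

* Step 3: second stage by plain OLS
* (the standard errors reported here treat X_hat as if it were observed data)
regress Y X_hat
```

The standard errors from the last regression are the part I do not trust.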

My question: Is there a way to approach this problem in Stata with ivreg or ivreg2? Or else, is there a way to do this "by hand" but handle the problem of using predicted values?

Thank you all very much in advance and best wishes,
Simon