Hello Statalist! I'm doing an exercise from Microeconometrics Using Stata, by Cameron and Trivedi: Exercise 11 of Chapter 6 (page 204).

"When an endogenous variable enters the regression nonlinearly, the obvious IV estimator is inconsistent and a modification is needed. Specifically, suppose y1 = b*y2^2 + u, and the first-stage equation for y2 is y2 = p*z + v, where the zero-mean errors u and v are correlated. Here the endogenous regressor appears in the structural equation as y2^2 rather than y2. The IV estimator is b_hat_IV = (sum z_i * y2_i^2)^(-1)*(sum z_i * y1_i). This can be implemented by a regular IV regression of y1 on y2^2 with the instrument z: regress y2^2 on z and then regress y1 on the first-stage prediction y2^2_hat. If instead we regress y2 on z at the first stage, giving y2_hat, and then regress y1 on (y2_hat)^2, an inconsistent estimate is obtained. Generate a simulation sample to demonstrate these points. Consider whether this example can be generalized to other nonlinear models where the nonlinearity is in regressors only, so that y1 = g(y2)'beta + u, where g(y2) is a nonlinear function of y2 [y2 being a vector of variables]."

I followed the approach proposed by: https://www.stata.com/statalist/arch.../msg00128.html

clear
set seed 444
quietly set obs 10000
gen double z = 5*rnormal(0)
gen double x = 5*rnormal(0)
matrix C = (1, -0.5 \ -0.5, 1)
corr2data u v, corr(C)
gen double y2 = 3*z + v
gen y2sq = y2^2
gen double y1 = 5 + 2*y2sq + x + u

ivregress 2sls y1 x (y2sq = z), vce(robust) first

** First stage
reg y2 z x
predict y2_hat, xb
generate y2_hat_sq = y2_hat^2

** Second stage
reg y1 y2_hat_sq x, robust
The standard errors and test statistics are larger for the inconsistent (plug-in) estimation, but the coefficient estimates are similar.
Is running this enough to demonstrate the point?
Could this be generalized to other nonlinear models?
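
For comparison, here is a stripped-down sketch that stays closer to the exercise's own model: no intercept, no exogenous regressor, y2 = p*z + v and y1 = b*y2^2 + u. Everything specific in it is my own assumption and not from the book: the seed, p = 1, b = 2, corr(u, v) = 0.5, and an instrument with mean 2. The nonzero mean is deliberate, because for a symmetric mean-zero z that is independent of v, the population covariance between z and y2^2 is zero, so z would not be a relevant instrument for y2^2.

```stata
* Sketch only: the DGP below is an illustrative assumption, not the book's solution
clear
set seed 12345
set obs 10000

* Instrument with nonzero mean, so that z is correlated with y2^2
gen double z = 2 + rnormal()

* Correlated zero-mean errors with corr(u, v) = 0.5
gen double v = rnormal()
gen double u = 0.5*v + sqrt(0.75)*rnormal()

* Exercise model: y2 = p*z + v with p = 1, y1 = b*y2^2 + u with b = 2
gen double y2   = z + v
gen double y2sq = y2^2
gen double y1   = 2*y2sq + u

* Consistent IV: instrument y2^2 directly with z
ivregress 2sls y1 (y2sq = z), noconstant vce(robust)
scalar b_iv = _b[y2sq]

* Plug-in version: regress y2 on z, then square the fitted value
regress y2 z, noconstant
predict double y2_hat, xb
gen double y2_hat_sq = y2_hat^2
regress y1 y2_hat_sq, noconstant vce(robust)
scalar b_plugin = _b[y2_hat_sq]

display "IV: " b_iv "   plug-in: " b_plugin
```

If I have the algebra right, under these assumptions the plug-in slope converges to b*(1 + var(v)*E[z^2]/(p^2*E[z^4])) instead of b, which is about 2.23 rather than 2 for the values above. With an intercept in the second stage and v independent of z, the extra var(v) term can be absorbed by the constant, which may be why the two sets of coefficients look similar in my original design.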