I am trying to create some fake data in which x = z * 0.466 + e_x, where x follows a normal distribution with mean 1.377 and SD 0.294. How can I generate z so that regressing x on z returns the expected coefficient (i.e. 0.466)? What is wrong with the following procedure? Perhaps it has to do with the minimization of the error term? I guess I should review my statistics book...
Code:
clear
set obs 5000
gen x = rnormal(1.377, 0.294)
gen e_x = rnormal(0)
gen z = ( x - e_x ) / 0.466
/* I got the above formula by solving the following equation for z => x = z * 0.466 + e_x */
reg x z
      Source |       SS           df       MS      Number of obs   =     5,000
-------------+----------------------------------   F(1, 4998)      =    386.63
       Model |  30.8533487         1  30.8533487   Prob > F        =    0.0000
    Residual |  398.844143     4,998  .079800749   R-squared       =    0.0718
-------------+----------------------------------   Adj R-squared   =    0.0716
       Total |  429.697492     4,999   .08595669   Root MSE        =    .28249

------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |     .03539   .0017998    19.66   0.000     .0318615    .0389184
       _cons |   1.273169   .0066615   191.12   0.000      1.26011    1.286229
------------------------------------------------------------------------------
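
A possible explanation for the small coefficient, offered as a sketch rather than a definitive answer: because z is built from e_x, the regressor is correlated with the error term, so OLS does not recover 0.466. The derivation below assumes x and e_x are independent draws (as in the code above) and uses the symbols b = 0.466, \sigma_x = 0.294 and \sigma_e = 1, which are my notation and not part of the original post.

```latex
% Assumptions (notation introduced here for illustration):
%   b = 0.466, Var(x) = \sigma_x^2 = 0.294^2, Var(e_x) = \sigma_e^2 = 1,
%   x and e_x independent, z = (x - e_x)/b.
\begin{align*}
\operatorname{Cov}(x, z) &= \frac{\operatorname{Var}(x)}{b} = \frac{\sigma_x^2}{b}, \\
\operatorname{Var}(z)    &= \frac{\sigma_x^2 + \sigma_e^2}{b^2}, \\
\hat\beta_{\text{OLS}} \;\to\; \frac{\operatorname{Cov}(x, z)}{\operatorname{Var}(z)}
  &= b \, \frac{\sigma_x^2}{\sigma_x^2 + \sigma_e^2}
   = 0.466 \times \frac{0.0864}{0.0864 + 1} \approx 0.037 .
\end{align*}
```

This is close to the estimated slope of .03539 in the output above. Under the same assumptions, one way to obtain a slope near 0.466 would be to reverse the order of construction: draw z and e_x independently first, then compute x = z * 0.466 + e_x, choosing the mean and SD of z and the SD of e_x so that x ends up with mean 1.377 and SD 0.294.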