Monday, April 22, 2019

Factor variables vs. dummy variables with interactions

Hi all,

I have a dummy variable x1 in my dataset that takes the values 0 and 1 (no missing values), and a variable x2 that takes the values 1 and 2 or is missing.

I would like to estimate the effects of x1, x2, and their interaction on the outcome y.

Code:
reg y x1 i.x2 i.x1#i.x2
produces different estimates for the coefficient on x1 than

Code:
reg y i.x1 i.x2 i.x1#i.x2
A simple regression without the interaction produces the same coefficient on x1 whether or not I use factor-variable notation. Does anyone know why this happens, or what is actually being estimated in each case?
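
A minimal sketch for reproducing the comparison on simulated data. The seed, sample size, missingness pattern, and coefficient values are assumptions for illustration only, not values from the dataset described above, and the last command uses the ## shorthand purely as a reference point.

```stata
* Simulated data (all values here are hypothetical)
clear
set seed 12345
set obs 200

* x1: 0/1 dummy with no missing values
generate x1 = runiform() < 0.5

* x2: takes the values 1 and 2, with a few observations set to missing
generate x2 = 1 + (runiform() < 0.5)
replace x2 = . in 1/10

* Outcome with main effects and an interaction
generate y = 1 + 0.5*x1 + 0.3*(x2 == 2) + 0.4*x1*(x2 == 2) + rnormal()

* x1 entered without factor notation in the main effect
regress y x1 i.x2 i.x1#i.x2

* x1 entered with factor notation throughout
regress y i.x1 i.x2 i.x1#i.x2

* Full factorial shorthand, for reference
regress y i.x1##i.x2
```

Comparing which terms Stata marks as base or omitted in each output table is one way to see what each reported x1 coefficient is being measured against.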


Thanks for any advice or help you can provide!
