Hi all,

I have a dummy variable x1 in my dataset that takes the values 0 and 1 (no missing values), and a variable x2 that takes the values 1 and 2 or is missing.

I would like to estimate the effects of x1, x2, and their interaction on the outcome y.

Code:
reg y x1 i.x2 i.x1#i.x2
produces a different estimate of the coefficient on x1 than

Code:
reg y i.x1 i.x2 i.x1#i.x2
A regression without the interaction term produces the same coefficient on x1 whether or not I use factor-variable notation (x1 versus i.x1). Does anyone know why the two specifications differ once the interaction is included, and what is actually being estimated in each case?


Thanks for any advice or help you can provide!