Hello all,
I am performing an OLS regression, something like: regress y i.x1##i.x2 x3 x4 x5, cluster(i)
The sample is 20,000 observations.
y is a logged count variable and is not a rare event.
x1 is a dichotomous variable that is not especially common, but I don't think that is problematic (it takes the value 1 for 600 of the 20,000 observations).
However, x2 is a rare event (the variable is mostly 0s with a few 1s).
I am interacting x1 and x2 because the interaction is of theoretical interest for my study. However, there are only about 20 observations for which x1 and x2 both equal 1. Is this a concern for proceeding with the study I want to conduct? There is a valid structural explanation for why only about 20 observations have x1 = 1 and x2 = 1, but I want to be sure that I am starting this study on a sound footing.
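In case it helps to see the setup, here is a rough sketch of how the cell counts can be checked and the model fit. The variable names (y, x1 to x5, i) are the placeholders used above rather than my real variable names, and vce(cluster i) is, as far as I know, the documented spelling of the older cluster(i) option.

```stata
* Cross-tabulate the two indicators to see how many observations fall in each cell
tabulate x1 x2

* Count the observations where both indicators equal 1 (about 20 in my sample)
count if x1 == 1 & x2 == 1

* Fit the model with the full factorial interaction and cluster-robust standard errors
regress y i.x1##i.x2 x3 x4 x5, vce(cluster i)
```

The tabulate output is where the roughly 20 joint observations show up, and it is that cell that the interaction coefficient rests on.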
Thanks for your input!