Hello all,


I am performing an OLS regression, something like: regress y i.x1##i.x2 x3 x4 x5, cluster(i)

The sample is 20,000 observations.

y is a logged count variable and is not a rare event.

x1 is a dichotomous variable that is not very common, but I don't think that is problematic (it takes the value 1 for 600 of the 20,000 observations).

However, x2 is a rare event (the variable is mostly 0s with a few 1s).

I am interacting x1 and x2 because the interaction is of theoretical interest for my study. However, there are only about 20 observations for which x1 and x2 both equal 1. Is this a concern for proceeding with the study I want to conduct? There is a valid structural explanation for why there are only 20 such observations, but I want to be sure that I am embarking on this course of study on sound footing.
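
For anyone who wants to see the cell sizes behind the interaction, a cross-tabulation along these lines should show them. This is only a minimal sketch: y, x1, x2 and i are the placeholder names from the command above, not my real variable names.

```stata
* Cross-tabulate the two indicators to see how many observations fall in each of the four cells
tabulate x1 x2

* Count the observations where both indicators equal 1 (about 20 in my data)
count if x1 == 1 & x2 == 1
```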


Thanks for your input!