Dear all,

I am estimating a linear probability model in Stata 15 using a mix of firm-level and country-level data. My dependent variable is a dummy (innov) that varies by firm, country, and survey year. Some of my regressors are firm-level variables; others are country-level variables that vary only by country and survey year. Below is an excerpt of my dataset.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 code float id int year byte(innov RD) double trade float REGION byte(REGION0 REGION1) float(tourXREGION1 tourXREGION2)
"ALB" 3 2019 0 .                 . 2 0 0 . .
"ALB" 3 2019 0 .                 . 2 0 0 . .
"ALB" 3 2019 0 .                 . 2 0 0 . .
"ARG" 4 2006 1 0 40.43347987191512 4 0 0 0 0
"ARG" 4 2006 1 1 40.43347987191512 4 0 0 0 0
"ARG" 4 2006 1 0 40.43347987191512 4 0 0 0 0
"ARG" 4 2006 1 0 40.43347987191512 4 0 0 0 0
"ARG" 4 2006 1 1 40.43347987191512 4 0 0 0 0
"ARG" 4 2006 . . 40.43347987191512 4 0 0 0 0
end
label values innov H1
label def H1 1 "Yes", modify
label values RD H8
label def H8 1 "Yes", modify

I have included regional dummies (REGION) and their interactions with my variable of interest (tour) in the model. Below is the frequency distribution of REGION.
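     REGION |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     15,323        9.65        9.65
          1 |     40,397       25.44       35.09
          2 |     17,531       11.04       46.13
          3 |      3,782        2.38       48.52
          4 |     32,331       20.36       68.88
          5 |      4,603        2.90       71.78
          6 |        835        0.53       72.30
          7 |     43,979       27.70      100.00
------------+-----------------------------------
      Total |    158,781      100.00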

I estimate the following model using factor-variable notation, adding country dummies (i.id) and year dummies (i.year):
Code:
reg innov i.REGION##c.tour RD size_num cert l_gdp_gr fdi_s_gdp trade i.year i.id, r

When I do so, (almost) all interaction terms are omitted, and Stata reports collinearity as the reason.
[Regression output not reproduced]

However, when I generate the interactions manually and run the same model as follows, the interaction terms are not omitted.
Code:
reg innov REGION1 REGION2 REGION3 REGION4 REGION5 REGION6 REGION7 tour tourXREGION1 tourXREGION2 tourXREGION3 tourXREGION4 tourXREGION5 tourXREGION6 tourXREGION7 RD size_num cert l_gdp_gr fdi_s_gdp  trade  i.year i.id, r
[Regression output not reproduced]
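
For reference, a minimal sketch of one way to check that hand-made interaction variables match their intended definition. It assumes each tourXREGIONk was meant to equal tour * (REGION == k), which the post does not state explicitly. The data here are toy data built only so that the check runs on its own; on the real dataset the first block would be skipped. The local macro bad is a hypothetical helper.

Code:
* --- toy data (assumption: stands in for the real dataset) ---
clear
set seed 2024
set obs 500
generate byte REGION = mod(_n, 8)                 // toy regions 0..7
generate double tour = runiform() if mod(_n, 10)  // every 10th obs left missing
forvalues k = 1/7 {
    * hand-made interactions, stored as float as in the excerpt
    generate float tourXREGION`k' = tour * (REGION == `k')
}

* --- the check itself ---
local bad = 0
forvalues k = 1/7 {
    * count observations where the manual variable departs from
    * tour * (REGION == k); the tolerance allows for float rounding
    quietly count if reldif(tourXREGION`k', tour * (REGION == `k')) > 1e-6 ///
        & !missing(tour, REGION)
    display "tourXREGION`k': " r(N) " mismatching observations"
    local bad = `bad' + r(N)
}
display "total mismatches: `bad'"

A nonzero count for any k would suggest that the manual variable is not the same regressor that i.REGION##c.tour constructs.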

I was therefore wondering whether regressions using factor-variable notation can fail to yield estimates under some circumstances, and whether I can trust the results obtained from the manually generated interactions.
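
As a point of comparison, the sketch below fits the same interaction model twice on toy data, once with factor-variable notation and once with hand-made dummies and products. When the manual variables are built as (REGION == k) and tour * (REGION == k), the two specifications should give the same coefficients and the same estimation sample, so a discrepancy on real data would point to differently constructed variables or a different sample. The toy data, the four-region setup, and the scalar names b_fv_int and N_fv are assumptions made for illustration only.

Code:
* --- toy data (assumption: not the dataset from the post) ---
clear
set seed 12345
set obs 800
generate byte REGION = mod(_n, 4)                 // toy regions 0..3
generate double tour = runiform()                 // toy regressor
generate byte innov = runiform() < (0.3 + 0.05*REGION + 0.1*tour)

* hand-made dummies and interactions, region 0 as the base
forvalues k = 1/3 {
    generate byte REGION`k' = (REGION == `k')
    generate double tourXREGION`k' = tour * REGION`k'
}

* factor-variable version
regress innov i.REGION##c.tour, vce(robust)
scalar b_fv_int = _b[1.REGION#c.tour]             // slope shift for region 1
scalar N_fv = e(N)

* manual version
regress innov REGION1 REGION2 REGION3 tour ///
    tourXREGION1 tourXREGION2 tourXREGION3, vce(robust)

* compare one interaction coefficient and the sample sizes
display "difference in coefficient: " b_fv_int - _b[tourXREGION1]
display "N (factor) = " N_fv "   N (manual) = " e(N)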

Thank you for any clarification,

Best,

Assi