Hi!

I'm analyzing the association between several economic variables and the subjective well-being of the Ecuadorian population. I want to see whether the regression coefficients of my variables differ between three continental regions, which I will call A, B and C in this post for simplicity. One way to do that would be to create sub-datasets with the observations of each region (the sample is representative at that level), run a regression in each of them and compare the coefficients. That is:

use "Observations_for_A.dta"
reg DV IVs

clear
use "Observations_for_B.dta"
reg DV IVs

clear
use "Observations_for_C.dta"
reg DV IVs
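
Assuming the three regions are also available together in one file with the 0/1 indicators A, B and C, the same three regressions could be run without switching datasets. The file name below is hypothetical:

* "All_observations.dta" is a hypothetical name for the pooled file
use "All_observations.dta", clear
reg DV IVs if A == 1
reg DV IVs if B == 1
reg DV IVs if C == 1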


However, that would reduce the number of observations in each regression. That can be problematic because some of the variables I am analyzing contribute little to the R-squared of the model, so statistical power is important. To avoid that, I thought of creating one dummy variable for each region as follows:

gen not_A = 1 if A == 0    // A = 1 if the observation belongs to A, 0 otherwise
recode not_A (. = 0)

gen not_B = 1 if B == 0    // the same for B
recode not_B (. = 0)

gen not_C = 1 if C == 0    // the same for C
recode not_C (. = 0)

And then run regressions that include interactions between those dummies and every IV, like this:

(1) reg DV c.IV1##not_A c.IV2##not_A c.IV3##not_A
(2) reg DV c.IV1##not_B c.IV2##not_B c.IV3##not_B
(3) reg DV c.IV1##not_C c.IV2##not_C c.IV3##not_C

Then the coefficient of IV1 alone in (1) should be interpreted (assuming causality) as the effect of IV1 on the DV when A = 1 (that is, when not_A = 0), and likewise for the other variables in (1), (2) and (3). It might be important to say that all the IVs are continuous and that I'm not really interested in the interpretation of A, B and C alone.
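
For comparison, the three specifications above could also be sketched as one pooled model. This is only a sketch, and it assumes that A, B and C are 0/1 indicators and that every observation belongs to exactly one region; the variable region is a hypothetical name that is not in my data yet:

* region is a hypothetical variable: 1 = A, 2 = B, 3 = C
gen byte region = 1*A + 2*B + 3*C

* One regression on the full sample; region A (the lowest value) is the base category
reg DV c.IV1##i.region c.IV2##i.region c.IV3##i.region

* Joint test of whether the slope of IV1 differs across the three regions
testparm i.region#c.IV1

* Slope of IV1 within each region
margins region, dydx(IV1)

Here the coefficient of IV1 alone would be the slope in region A, and each interaction term would be the difference between the slope in B (or C) and the slope in A.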

Am I right? What problems do you see in these specifications? Would this be a good solution to avoid losing observations?

Thank you in advance!