Dear Stata Community:

I am new to Stata and have begun gathering information on how to run fixed effects regression models. I believe the xtreg and reghdfe commands can accomplish this.

Let’s say my panel data has the following three main variables:
  • Firm (e.g. 1,000 firms: Firm 1 to Firm 1000)
  • Year (firm-level data is collected over several years: 1980 to 2000)
  • Industry (i.e., different firms belong to a specific industry that does not change over time)
My questions relate to the models shown at the bottom of this post. Please note that I have gone through several posts in the Stata forum but still have a few questions. Any help would be greatly appreciated.

Question 1

Are Model 5 and Model 6 the same?


In particular, is the interaction term (i.industry#i.year) in Model 5 the SAME as the group variable (industry_year) in Model 6?

From my understanding:
  • Models 5 and 6 DO control for firm fixed effects (FE)
  • Models 5 and 6 do NOT control for year fixed effects
However:
  • Model 5 controls for the interaction of industry and year (i.e. i.industry#i.year)
  • Model 6 controls for the group variable (industry_year)
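
One way I thought of checking this myself is to run both models on made-up data and compare the coefficient on the regressor. This is only a sketch: the variable names y and x, the seed, and the panel dimensions (50 firms, 5 industries) are all hypothetical, and it assumes reghdfe is installed (it is a community-contributed command, available via ssc install reghdfe, and it also needs ftools).

```stata
* Simulated panel with hypothetical names (y, x); this is not my real data
clear
set seed 12345
set obs 50                          // 50 made-up firms
gen firm = _n
gen industry = ceil(firm / 10)      // 5 industries, constant within firm
expand 21                           // 21 yearly observations per firm
bysort firm: gen year = 1979 + _n   // years 1980 to 2000
gen x = rnormal()
gen y = 0.5 * x + rnormal()

* Model 5: firm FE via xtreg, industry-by-year indicators via factor variables
xtset firm
xtreg y x i.industry#i.year, fe
scalar b_model5 = _b[x]

* Model 6: the same industry-year cells built with egen group(), absorbed by reghdfe
egen industry_year = group(industry year)
reghdfe y x, absorb(firm industry_year)
scalar b_model6 = _b[x]

* Show the two slope estimates side by side
display "Model 5: " b_model5 "   Model 6: " b_model6
```

If the two displayed coefficients agree up to numerical tolerance, that would suggest the two specifications are equivalent, at least for the slope on x.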
Question 2

For both Model 5 and Model 6, can I add i.year (giving Model 5a) and year inside absorb() (giving Model 6a) to control for year fixed effects? That is:


Model 5a
xtset firm
xtreg dependent_variable independent_variables i.industry#i.year i.year, fe

Model 6a

egen industry_year = group(industry year)
reghdfe dependent_variable independent_variables, absorb(firm year industry_year)

Question 3

Should the panelid (in xtset <panelid>) ideally always be set to the highest level of aggregation, or does that depend on the research question?



Question 4

Why do researchers run an industry and year fixed effects model (e.g., Models 3 and 4) when a firm and year fixed effects model (i.e., Models 1 and 2) is much superior?

(assuming that the dataset has a three-level structure: repeated observations over time nested within firms, which are nested within industries)


That is, since industry is time-invariant within firms, the firm fixed effect absorbs the industry fixed effect, so the firm fixed effects model is the more robust specification (it controls for more unobservable factors, including time-invariant industry-level unobservables).
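
The argument above rests on industry never changing within a firm, which I assume holds in my data. A minimal sketch of how that assumption could be verified is below, using a small made-up panel so that it runs on its own. The names firm, industry, and year follow this post; the panel dimensions are hypothetical.

```stata
* Made-up panel: 50 firms in 5 industries, observed 1980 to 2000
clear
set obs 50
gen firm = _n
gen industry = ceil(firm / 10)      // assigned once per firm, so time-invariant
expand 21
bysort firm: gen year = 1979 + _n

* Sort by industry within firm; if the first and last values match,
* industry is constant within that firm. assert stops with an error otherwise.
bysort firm (industry): assert industry[1] == industry[_N]
```

On real data only the last line would be needed. If the assert fails, some firm switches industry, and the firm fixed effect would no longer fully absorb the industry effect.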

As I’m new to Stata and to multiple fixed effects regression models, please forgive my ignorance in the above questions.

I look forward to hearing any insights from the Stata community.

Thanks,
Reuben

A: Two-way fixed effects model (with firm and year fixed effects)

If I wanted to run a regression with firm and year fixed effects, I would run the following:

Model 1 (using xtreg)

xtset firm year
xtreg dependent_variable independent_variables i.year, fe

Model 2 (using reghdfe)

reghdfe dependent_variable independent_variables, absorb(firm year)

B: Two-way fixed effects model (with industry and year fixed effects)

Model 3 (using xtreg)

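* Panel variable only: several firms share each industry-year, so also
* declaring year as the time variable would fail with "repeated time values within panel"
xtset industry
xtreg dependent_variable independent_variables i.year, fe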


Model 4 (using reghdfe)

reghdfe dependent_variable independent_variables, absorb(industry year)

C: Multiple fixed effects model (with industry-year FE and firm FE)

Model 5 (using xtreg)

xtset firm
xtreg dependent_variable independent_variables i.industry#i.year, fe


Model 6 (using reghdfe)

egen industry_year = group(industry year)

reghdfe dependent_variable independent_variables, absorb(firm industry_year)