Hello,

I'm fairly new to Stata and empirical research, so sorry if this question is very basic.

1: I have observations for different companies in different years, with a dependent variable and several independent variables. I would like to include both the industry and the year in which each observation was recorded as fixed effects.
To my understanding, one does that with:

xtset industry year
xtreg DV IV1 IV2...., fe

However, my data has repeated time values within each industry, so that doesn't work with xtset. Should I instead combine industry and year into one variable and then run the regression, or am I missing something substantial?

egen industry_year = group(industry year)
xtset industry_year
xtreg DV IV1 IV2...., fe
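
For comparison, here is a sketch of an alternative I have seen mentioned: declare only the panel variable in xtset (no time variable) and add the years as factor-variable dummies. DV, IV1 and IV2 are placeholders for my actual variables, and I am assuming industry and year are numeric with nonnegative integer values.

```stata
* Sketch only: DV, IV1, IV2 are placeholders for my actual variables.
* Declaring just the panel variable avoids the "repeated time values" error,
* because xtset then does not need unique (panel, time) pairs.
xtset industry

* fe absorbs the industry effects; i.year adds one dummy per year
* (minus a base year).
xtreg DV IV1 IV2 i.year, fe
```

As far as I can tell, this gives separate (additive) industry and year effects, whereas group(industry year) gives one effect per industry-year combination, so the two are not the same model.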


2: When reporting the R-squared from xtreg, the overall value is the one to use, right?


3: In my mind, simply including the different years and industries as dummy variables in the regression should yield the same result as in 1.

So just doing:

regress DV IV1 IV2..... year1 year2 year3.....industry1 industry2.....

should be the same as:

xtset industry_year
xtreg DV IV1 IV2...., fe


But doing so, I get a slightly different result. Why is that, or is my approach in 1 flawed?
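
To make the comparison in 3 concrete, here is a sketch of the dummy-variable regression written with factor-variable notation instead of hand-made dummies. Again, DV, IV1 and IV2 are placeholders, and I am assuming industry, year and industry_year are numeric with nonnegative integer values.

```stata
* Sketch only: DV, IV1, IV2 are placeholders for my actual variables.
* Additive dummies: one per year and one per industry (minus base levels).
regress DV IV1 IV2 i.year i.industry

* One dummy per industry-year combination, using the grouped variable
* created with egen ... group(industry year) in 1.
regress DV IV1 IV2 i.industry_year
```

As far as I can tell, only the second regression uses the same set of effects as xtreg with industry_year as the panel variable; the first one treats year and industry as separate effects.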




Thank you in advance for your answer, and sorry if these questions are basic.