I am completely new to Statalist, Stata and "decent" econometrics in general, so I apologise in advance for my ignorance. I couldn't find an answer to what I am struggling with, although I did find some threads that are partly related. I believe the following is closest to my question:
https://www.statalist.org/forums/for...on-with-output
What I am doing is a natural experiment on changes in the proximity of firms to banks/branches, analysing whether it has influenced their business outcomes. I have an unbalanced panel of roughly 15,000 firms over 9 years (2010 to 2018). It is paired with an unbalanced panel of roughly 470 branches, but that part only mattered for calculating distances (getting the values of the independent variables). My main variables of interest as “measures of proximity” are: (i) distance to the closest branch in km, (ii) distance to the closest branch in minutes, (iii) number of branches within a 10-km radius, (iv) number of branches within a 30-min travel distance, (v) number of branches within a 5-km radius, (vi) number of branches within a 15-min travel distance, etc. In total, I have 8 such variables, 2 of which are floats and 6 integers. I will refer to them as x_1, x_2, …, x_8. Since I want to do a multiple regression analysis, I have squared each of them, so I technically have 16 independent variables forming 8 pairs, and in each regression I want to regress my dependent variable on one pair (e.g. x_1, sq_x_1).
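For reference, the squared terms described above could be created in a single loop. This is only a minimal sketch, assuming the eight proximity variables are literally named x_1 to x_8 and the squares are to be called sq_x_1 to sq_x_8:

```stata
* Assumes variables x_1 ... x_8 exist and sq_x_1 ... sq_x_8 do not yet exist
forvalues k = 1/8 {
    * store the square in double precision to limit rounding error
    generate double sq_x_`k' = x_`k'^2
}
```

An alternative that avoids creating new variables would be factor-variable notation, e.g. c.x_1##c.x_1, which enters x_1 and its square in the regression.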
I have 20 dependent variables of interest, each of which indicates what I could potentially observe as a firm’s outcome (revenues, profits, number of effective employees…). Each of those is a float. I will refer to those as: y_1, y_2, …, y_20.
In terms of fixed effects, I am still not sure how many I will have, but I know that some will be “simple” and some “multiple”. Say I have 5 “simple” fixed effects (like year FE, municipality FE): fe1_1, fe1_2, …, fe1_5, and 3 “multiple” ones (such as year-industry FE), denoted fe2_1, fe2_2, fe2_3. So in my regressions, three cases are possible:
- To have one or a few “simple” fixed effects
- To have one or a few “multiple” fixed effects
- To have a combination of one or a few “simple” and “multiple” fixed effects
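As an illustration of these three cases, here is a hedged sketch of how they might be written with factor variables. It assumes the data are already xtset by firm and year, and it uses hypothetical variable names municipality and industry that are not part of my actual dataset description. The firm effect is absorbed by the fe option; this is one possible setup, not necessarily the right specification.

```stata
* Hypothetical variable names: municipality, industry
* Case 1: "simple" fixed effects only
xtreg y_1 x_1 sq_x_1 i.year i.municipality, fe

* Case 2: "multiple" fixed effects only (year-by-industry cells)
xtreg y_1 x_1 sq_x_1 i.year#i.industry, fe

* Case 3: a combination of both
xtreg y_1 x_1 sq_x_1 i.municipality i.year#i.industry, fe
```

Any indicator that does not vary within a firm over time is collinear with the firm effect and will be omitted by Stata.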
Since this is the first time I am doing panel data analysis, I would really appreciate any comments on both the syntax and the way I use it (possible options, etc.). This is how I think I should start:
Code:
global id id_firm
global t year
global ylist y_1 y_2 y_3 … y_20
global xlist x_1 sq_x_1 … x_8 sq_x_8
global felist .fe1_1 … fe1_5 .fe2_1 .fe2_2 .fe2_3
sort $id $t
xtset $id $t
xtdescribe
xtsum $id $t $ylist $xlist
Code:
foreach y of global ylist {
    foreach fe of global felist {
        * step through $xlist two words at a time: (x_1 sq_x_1), (x_2 sq_x_2), ...
        forvalues i = 1(2)16 {
            local v1 : word `i' of $xlist
            local v2 : word `=`i' + 1' of $xlist
            xtreg `y' `v1' `v2' `fe', fe
            * or
            * xtreg `y' `v1' `v2' `fe', fe vce(robust)?
            * For saving output, could it be something like this:
            * outreg2 sum using file, tex append
            * I would prefer my output both in tex and dta format
        }
    }
}
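For the dta part of the output, one option I could imagine is Stata's built-in postfile, which writes one row per regression to a dataset. The sketch below is only an assumption about how this might look: it presumes the globals ylist and felist are spelled out in full (no "…"), that each entry of felist is a valid factor-variable term, that the data are xtset, and that results is an acceptable (hypothetical) file name. The column names b_x, se_x, b_sq and se_sq are also my own invention.

```stata
* Sketch: collect the coefficients on x_k and sq_x_k from every regression
tempname memhold
postfile `memhold' str32 depvar str32 xvar str80 fe ///
    double(b_x se_x b_sq se_sq) long N using results, replace

foreach y of global ylist {
    forvalues k = 1/8 {
        foreach fe of global felist {
            * capture lets the loop survive a regression that fails
            capture noisily xtreg `y' x_`k' sq_x_`k' `fe', fe vce(robust)
            if _rc continue
            post `memhold' ("`y'") ("x_`k'") ("`fe'") ///
                (_b[x_`k']) (_se[x_`k']) ///
                (_b[sq_x_`k']) (_se[sq_x_`k']) (e(N))
        }
    }
}
postclose `memhold'

* the collected results are now an ordinary Stata dataset
use results, clear
list in 1/5
```

Here the pair index k replaces the word-position arithmetic on xlist, which works only because the variables follow the x_k / sq_x_k naming pattern.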
I would be grateful if anyone could help me make these loops work and figure out a way to store the output. I would prefer storing only individual coefficients on the fixed effects.
The final question I have is whether I should run the Hausman test only once for my whole panel, or in each regression. Looking at this syntax, and as far as I understand it, it is enough to do it once for the dataset:
Code:
quietly xtreg $ylist $xlist, fe
estimates store fixed
quietly xtreg $ylist $xlist, re
estimates store random
hausman fixed random
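One caveat with the syntax above: xtreg treats the first variable as the dependent variable and all remaining ones as regressors, so xtreg $ylist $xlist would regress y_1 on y_2 … y_20 and every x. If the test has to be run per regression instead, a sketch might look like the following, assuming the x_k / sq_x_k naming pattern, that ylist is spelled out in full, and that the data are xtset:

```stata
* Sketch: one Hausman test per (dependent variable, proximity pair)
foreach y of global ylist {
    forvalues k = 1/8 {
        quietly xtreg `y' x_`k' sq_x_`k', fe
        estimates store fixed
        quietly xtreg `y' x_`k' sq_x_`k', re
        estimates store random
        display as text "Hausman test: `y' on x_`k' and sq_x_`k'"
        * capture noisily keeps the loop going if a test cannot be computed
        capture noisily hausman fixed random
    }
}
```

Whether a per-regression test is the right approach is exactly what I am asking about; the loop only shows how it could be done.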
MANY THANKS IN ADVANCE TO EVERYONE WHO FINDS SOME TIME TO HELP!
JR