Hello,

When it comes to repeated cross-sections, there always seems to be little literature out there.

I am estimating a binary logit model using repeated cross-section data across 19 years (2001 to 2019).
I am currently regressing the dependent dummy variable "repairing an item" (=1 vs. replacing =0) on some causal regressors such as age, educ, price, etc., plus year dummies for the 19 years.
I have set 2001 as the base year.
I have 8436 observations in total, all from the same country A, and 24 regressors in this setting. However, I am not sure this regression is useful for my purposes, nor am I sure how to interpret the time coefficients.

The aim of my analysis is twofold:
1. to focus on the signs of the causal regressors (age, educ etc.) and see whether they match the signs predicted by my theoretical model
2. to track how repairing vs. replacing an item_i has trended over the years in country A.

Questions with respect to:
1.
1.a. I have set my base year by dropping the time dummy d2001 and this way 2001 has become my reference year in Stata. Can you confirm that?

1.b. If that is true, then the coefficients obtained for the causal regressors (age, educ etc.) should refer to the base year 2001, if I am not wrong, while the dummies for the other years should top up the constant from the regression, in this case _cons = -2.62841 (e.g. d2006 = .59016, so if I want the constant for year 2006 I can obtain it by summing _cons + d2006 = -2.62841 + .59016 = -2.03825). If I want to obtain the coefficient of each causal regressor for each year (e.g. age for year 2006, educ for year 2006, ..., age for year 2019, educ for year 2019), I would have to interact each year dummy with each causal regressor and include the interactions in my regression, but that would add 18 x 6 = 108 regressors (18 year dummies, each interacted with each of the 6 causal regressors). That doesn't look optimal, and I am not sure my whole interpretation would even be correct.
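
To make 1.b concrete, this is how I would write down the model I think I am estimating. The notation (\beta_0, \delta_t, \beta, \gamma_t) is my own and is only a sketch of my understanding, not something taken from the Stata output:

```latex
% Model as currently specified: common slopes, year-specific intercept shifts
\Pr(\text{repair}_i = 1 \mid x_i, t)
  = \Lambda\!\left(\beta_0 + \delta_t + x_i'\beta\right),
\qquad \Lambda(z) = \frac{1}{1 + e^{-z}},
\qquad \delta_{2001} = 0

% Intercept for 2006, using the two estimates quoted above
\beta_0 + \delta_{2006} = -2.62841 + 0.59016 = -2.03825

% Version with year-specific slopes (what the year-by-regressor interactions would estimate)
\Pr(\text{repair}_i = 1 \mid x_i, t)
  = \Lambda\!\left(\beta_0 + \delta_t + x_i'(\beta + \gamma_t)\right),
\qquad \gamma_{2001} = 0
```

Here x_i collects the causal regressors (age, educ, etc.), \delta_t is the coefficient on the dummy for year t, and \gamma_t would be the vector of interaction coefficients for year t.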

1.c. If my interpretation of the coefficients in 1.a and 1.b is correct, then how is it possible that when I change the base year, the coefficients of the causal regressors do NOT change?

1.d. To compute the average partial effects (APEs) for each of the causal regressors of main interest, what are the commands to do it in Stata? Specifically, if I want to compute the APE of education at a saving rate of 10% and in the year 2002, is the following the right command in Stata?

margins, dydx(HighEd1) at (sav_rate=10 d2002=1 d2003=0 d2005=0 d2004=0 d2006=0 d2007=0 d2008=0 d2009=0 d2010=0 d2011=0 d2012=0 d2013=0 d2014=0 d2015=0 d2016=0 d2017=0 d2018=0)

I obtain an APE = .0045843. Interpretation: one more grade of schooling increases the probability of repairing by 0.0046 percentage points. Is this correct?

1.e. What if I instead want the APE of education, same as above, but across the years? Should I then simply "switch on" the dummies for the years, i.e. set them all equal to one (dyear=1)?
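
For 1.d and 1.e, an alternative I have seen suggested is to let Stata build the year dummies through factor-variable notation, so that margins knows they belong to a single variable. A minimal sketch, assuming a numeric variable year (2001 to 2019) exists in my data; repair, age and price are placeholder names, and only some of my regressors are listed:

```stata
* Sketch only: repair, age, price and year are assumed/placeholder variable names
* ib2001.year creates the year dummies and uses 2001 as the base category
logit repair HighEd1 sav_rate age price ib2001.year

* APE of HighEd1 with sav_rate fixed at 10 and every observation treated as if in 2002
margins, dydx(HighEd1) at(sav_rate=10 year=2002)

* Same APE, evaluated separately with year set to each value from 2001 to 2019
margins, dydx(HighEd1) at(sav_rate=10 year=(2001(1)2019))
```

If this is right, it would replace the long list of d20xx=0 settings in my command above, but I would appreciate confirmation.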


2.
2.a. With the same setting as in 1., and in order to track how repair has changed over time, I thought I could calculate the APE of each year. In this case I was wondering whether it makes any sense to compute the margins for each year, e.g. margins, dydx(d2006), but I guess not.
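
What I might do instead for 2.a is ask for the average predicted probability of repair in each year, holding the other regressors at their observed values. Again only a sketch, assuming a numeric year variable and placeholder names for repair, age and price:

```stata
* Sketch only: repair, age, price and year are assumed/placeholder variable names
logit repair HighEd1 sav_rate age price ib2001.year

* Average predicted probability of repair with year set to each value in turn
margins year

* Plot those predicted probabilities against year
marginsplot

* Discrete change in the predicted probability for each year relative to 2001
margins, dydx(year)
```

I am not sure whether the predicted probabilities or the discrete changes relative to 2001 are the better way to describe the trend.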

2.b. An alternative and more compact approach to 1., where I insert all the time dummies, would be to group the years into maybe 3 half-decade periods and then run a regression with the period dummies and the interaction terms of each period dummy with the causal regressors. I could then compute the margins for the main variables of interest for all the years.
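
A rough sketch of what I have in mind for 2.b. The cut-points of the three periods are arbitrary choices for illustration, and repair, age and year are placeholder names:

```stata
* Sketch only: the period boundaries below are an assumption, not a recommendation
recode year (2001/2006 = 1) (2007/2012 = 2) (2013/2019 = 3), generate(period)

* Period dummies plus interactions of each period with the causal regressors
logit repair i.period##c.HighEd1 i.period##c.sav_rate i.period##c.age

* APE of HighEd1 with period set to each of its three values in turn
margins, dydx(HighEd1) at(period=(1 2 3))
```

This would give one APE per period rather than per year, which is the price of the more compact specification.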

2.c. What is the right approach to group the years into macro-periods to be fed into the regression? I was thinking of plotting the % of repair (over the total = repair + replacement) for each year. Then, starting from 2001, I would group the years up to, for example, 2004 if I see that the % of repair has been increasing from 2001 to 2004, while if the % drops in 2005 I would start a new group from 2005 until the year it starts growing again. Does this sound feasible, or is it better to group by half-decades?

Sorry for the long questions. Please note that if you have a general idea for tackling my initial objectives 1. and 2., then there is no need for you to correct each of the potentially wrong statements I make in the subpoints, but I would be happy to hear the alternative approach you propose.


Thank you very much!


Best,

Linda