Dear all,

I am currently examining the impact of annual average sunset time on the sleep duration of children in four developing countries.
I have a panel data set with three waves (2009, 2013 and 2016).
The variable "sleep" denotes the hours per day allocated to sleep by child i in studysite s in country c at time t.
The variable "annual average sunset" varies only at the studysite level: it denotes the annual average sunset time in studysite s in country c.

I ran the following regressions:

*OLS
eststo m1: regress sleep annual_avg_sunset if in_model_3==1
estadd local fe No
estadd local fe_ No

*OLS with control variables
eststo m2: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year, vce(robust)
estadd local fe Yes
estadd local fe_ No

* FE (country and year), robust SE
eststo m3: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)
estadd local fe Yes
estadd local fe_ Yes

* FE (country and year), SE clustered at the studysite-year level
eststo m4: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(cluster studysite_year)
estadd local fe Yes
estadd local fe_ Yes
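
Note that the cluster variable studysite_year used in Model (4) is not defined in the code above. A minimal sketch of how such a variable could be built, assuming it is simply the combination of a site identifier and the survey year (the variable name studysite is an assumption based on the names used here, and n_cluster_tag is a hypothetical helper):

```stata
* Assumed: studysite identifies the site, year identifies the survey round
egen studysite_year = group(studysite year)

* Flag one observation per cluster, then count the distinct clusters
egen n_cluster_tag = tag(studysite_year)
count if n_cluster_tag == 1
```

The count is worth reporting alongside the results, because cluster-robust standard errors rely on having a reasonably large number of clusters.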

Model (3) uses robust standard errors, while Model (4) uses standard errors clustered at the studysite_year level. In Model (4) the standard error on annual average sunset time becomes much larger and the coefficient is no longer statistically significant.

Now I am wondering whether I should cluster the standard errors and, if so, at what level. Does it make sense to cluster at the studysite_year level in my example?

Thank you,

Barbara