Before beginning my analysis I drop wave 4, because response in that wave is poor:
Code:
drop if wave==4
I would like to run a fixed effects analysis, but the data providers have suggested applying the wave 3 weight they supply to make the sample representative of the national child population.
xtlogit will not let me apply these weights, so instead I do the following:
Code:
clogit child_overweight_y parents_unemployed_y i.urban_or_rural_y child_age_y [pw=weighting_factor], group(id) nolog robust
margins, dydx(parents_unemployed_y) post
estimates store logitmod
estimates table logitmod, star stats(N r2 r2_a)
Code:
. clogit child_overweight_y parents_unemployed_y i.urban_or_rural_y child_age_y [pw=weighting_factor], gro
> up(id) nolog robust
note: multiple positive outcomes within groups encountered.
note: 7,150 groups (20,713 obs) dropped because of all positive or
all negative outcomes.
Conditional (fixed-effects) logistic regression
Number of obs = 5,341
Wald chi2(3) = 51.26
Prob > chi2 = 0.0000
Log pseudolikelihood = -1939.9054 Pseudo R2 = 0.0199
(Std. Err. adjusted for clustering on id)
--------------------------------------------------------------------------------------
| Robust
child_overweight_y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------------------+----------------------------------------------------------------
parents_unemployed_y | .4196281 .1226047 3.42 0.001 .1793273 .659929
1.urban_or_rural_y | .1462153 .1653259 0.88 0.376 -.1778174 .4702481
child_age_y | -.0089368 .0013886 -6.44 0.000 -.0116585 -.0062151
--------------------------------------------------------------------------------------
. margins, dydx(parents_unemployed_y) post
Average marginal effects Number of obs = 5,341
Model VCE : Robust
Expression : Pr(child_overweight_y|fixed effect is 0), predict(pu0)
dy/dx w.r.t. : parents_unemployed_y
--------------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
---------------------+----------------------------------------------------------------
parents_unemployed_y | .1024646 .029751 3.44 0.001 .0441537 .1607755
--------------------------------------------------------------------------------------
. estimates store logitmod
. estimates table logitmod, star stats(N r2 r2_a)
------------------------------
Variable | logitmod
-------------+----------------
parents_un~y | .10246458***
-------------+----------------
N | 5341
r2 |
r2_a |
------------------------------
legend: * p<0.05; ** p<0.01; *** p<0.001
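The note about 7,150 dropped groups reflects the fact that the conditional logit only uses children whose overweight status changes at least once across waves. A minimal sketch for counting those children, assuming one row per child per wave; min_ow, max_ow, varies and tag_status are hypothetical helper names, and the counts may not match the clogit note exactly because estimation also excludes observations with missing covariates or weights:
Code:
* Lowest and highest observed overweight status for each child (missing values are ignored)
bysort id: egen byte min_ow = min(child_overweight_y)
bysort id: egen byte max_ow = max(child_overweight_y)
* 1 if the child's status differs in at least one wave, 0 if it never changes
gen byte varies = min_ow != max_ow
* Flag one row per child so each child is counted once
egen byte tag_status = tag(id)
tab varies if tag_status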
I would like to cluster the standard errors by the child's location, but the closest variable I have is urban_or_rural_y, a binary indicator of whether the child lives in an urban or a rural region:
Code:
. tab urban_or_rural_y
urban_or_ru |
ral_y | Freq. Percent Cum.
------------+-----------------------------------
0 | 17,091 57.34 57.34
1 | 12,713 42.66 100.00
------------+-----------------------------------
Total | 29,804 100.00
.
Here 0 is urban and 1 is rural. When I try to cluster on this variable I get the following error:
Code:
groups (strata) are not nested within clusters
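This message appears when at least one group (here, a child id) falls into more than one cluster, for example a child who is urban in one wave and rural in another. A minimal sketch for checking how many children that applies to; ever_moved and tag_move are hypothetical helper names, and a child with a missing urban_or_rural_y in some wave will also be flagged because missing values sort last:
Code:
* Within each child, sort by location and compare the first and last values
bysort id (urban_or_rural_y): gen byte ever_moved = urban_or_rural_y[1] != urban_or_rural_y[_N]
* Flag one row per child so each child is counted once
egen byte tag_move = tag(id)
* Any child with ever_moved == 1 breaks the nesting of id within urban_or_rural_y
tab ever_moved if tag_move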
I want to know whether parental unemployment increases the probability of a child being overweight. I take the result above as indicating that it does, by about 10 percentage points (dy/dx = 0.102), i.e. as either parent goes from employed to unemployed, the probability of the child going from normal weight to overweight increases by roughly 0.10. Note that margins reports this as Pr(child_overweight_y | fixed effect is 0).
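Because the margins output evaluates the probability with the fixed effect set to 0, it may also be worth reading the clogit coefficient as an odds ratio, which does not require a value for the fixed effect. A sketch using the coefficient reported above (0.4196281); exponentiating it gives an odds ratio of roughly 1.5. The second command is simply the original clogit call with the or option added, not a different model:
Code:
* exp(coefficient) is the within-child odds ratio for parents_unemployed_y
display exp(.4196281)
* Alternatively, re-estimate and have Stata report odds ratios directly
clogit child_overweight_y parents_unemployed_y i.urban_or_rural_y child_age_y [pw=weighting_factor], group(id) nolog robust or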
Having done that, I would like to know whether either parent being unemployed increases the child's BMI z-score. My reasoning is that a large positive z-score means the child is further above the mean and closer to being overweight, so I do the following:
Code:
xtreg z_score_bmi parents_unemployed_y i.urban_or_rural_y child_age_y [pw=weighting_factor], fe
Which gives me the following result:
Code:
. xtreg z_score_bmi parents_unemployed_y i.urban_or_rural_y child_age_y [pw=weighting_factor], fe
Fixed-effects (within) regression Number of obs = 26,054
Group variable: id Number of groups = 8,972
R-sq: Obs per group:
within = 0.0089 min = 1
between = 0.0000 avg = 2.9
overall = 0.0024 max = 3
F(3,8971) = 30.52
corr(u_i, Xb) = -0.0192 Prob > F = 0.0000
(Std. Err. adjusted for 8,972 clusters in id)
--------------------------------------------------------------------------------------
| Robust
z_score_bmi | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------------------+----------------------------------------------------------------
parents_unemployed_y | .1075761 .0263291 4.09 0.000 .055965 .1591872
1.urban_or_rural_y | .0516994 .0344913 1.50 0.134 -.0159113 .1193102
child_age_y | -.0026084 .0003005 -8.68 0.000 -.0031974 -.0020193
_cons | .8034086 .0191922 41.86 0.000 .7657874 .8410298
---------------------+----------------------------------------------------------------
sigma_u | .86341018
sigma_e | .77595763
rho | .55319391 (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
I take this as indicating that when either parent becomes unemployed, the child's BMI z-score increases by about 0.11, i.e. roughly a tenth of a standard deviation.
Do my approach and my understanding of the results make sense?
I would hate to make a mistake, and would really appreciate it if anyone could point out any errors now, so that I can correct them at the beginning of my study.
Thank you so much,
John