I am working with DHS (Demographic and Health Survey) data. I have pooled data from about 25 countries, taking the 2 most recent waves from each country. My dependent variable is neghaz (the negative of height for age (cm/months)), which is continuous. My regression specification includes several control variables, including squared terms and interaction terms. It also includes variables calculated at the PSU/cluster level (mean employment rate in the cluster, etc.). I have some doubts regarding weighting and clustering.
1. The number of observations varies widely across countries: India has more than 200000 valid observations, while the African countries have only a few thousand each. Should I use weights in the regression analysis in this case?
I ran the regression both ways, i.e., with and without weights (reg neghaz $controlset [pweight = perweight], cluster(psuid)). The standard errors changed considerably. Is the weighted regression the better choice for this analysis? The weights that I am using have already been de-normalized.
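A minimal sketch of the comparison described above, in case it helps readers reproduce it. It assumes the variable names from the post (neghaz, perweight, psuid, the global $controlset) plus a numeric survey identifier called survey. The names psu_uid, wsum, and w_equal are hypothetical and introduced only for illustration. The equal-weight rescaling is one possible convention, not something the post itself uses.

```stata
* Sketch only: assumes neghaz, perweight, psuid and $controlset from the post,
* plus a numeric survey identifier named survey (an assumption).

* PSU codes may be reused across surveys, so build an id unique in the pooled data.
egen long psu_uid = group(survey psuid)

* Unweighted fit, clustered on the pooled-unique PSU id.
regress neghaz $controlset, vce(cluster psu_uid)
estimates store unweighted

* Weighted fit using the de-normalized weights as they are.
regress neghaz $controlset [pweight = perweight], vce(cluster psu_uid)
estimates store weighted

* Hypothetical alternative: rescale weights to sum to 1 within each survey,
* so that every survey carries the same total weight regardless of sample size.
bysort survey: egen double wsum = total(perweight)
generate double w_equal = perweight / wsum
regress neghaz $controlset [pweight = w_equal], vce(cluster psu_uid)
estimates store equal_wt

* Coefficients and standard errors side by side.
estimates table unweighted weighted equal_wt, b(%9.4f) se(%9.4f)
```

Putting the three fits side by side shows whether the change in standard errors comes from the weights themselves or from how much total weight the largest survey carries.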
2. I read on some forums that when pooling data from several countries, clustering should be done at the country level as well as at the PSU/cluster level. Some people recommended using fixed effects by adding an "i.survey" term to the model specification. Others recommended a hierarchical model to account for clustering at multiple levels. Which model should I use for this analysis, and why? I have read some literature on this, but it only left me confused.
When I ran the regression with an i.survey term, both the coefficients and the standard errors changed.
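For concreteness, the alternatives mentioned above could be written roughly as follows. This is a sketch under assumptions, not a recommendation: survey is assumed to be a numeric identifier for each country-wave, psu_uid is a hypothetical PSU id made unique across surveys, and the multilevel model is shown unweighted because weights in mixed have to be specified separately for each level.

```stata
* Sketch only: variable names other than neghaz, perweight, psuid and
* $controlset are assumptions made for illustration.

* Create the pooled-unique PSU id if it does not already exist.
capture confirm variable psu_uid
if _rc {
    egen long psu_uid = group(survey psuid)
}

* (a) Survey fixed effects, standard errors clustered at the PSU level.
regress neghaz $controlset i.survey [pweight = perweight], vce(cluster psu_uid)

* (b) Survey fixed effects, standard errors clustered at the survey level.
regress neghaz $controlset i.survey [pweight = perweight], vce(cluster survey)

* (c) Three-level random-intercept model: children within PSUs within surveys.
mixed neghaz $controlset || survey: || psu_uid:
```

Since PSUs are nested within surveys, clustering at the survey level in (b) also allows for correlation within PSUs, although with only about 50 surveys the number of clusters is fairly small.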
I am sorry if my queries sound noobish. This is my first time with pooled data from so many surveys.