I am working with DHS (Demographic and Health Survey) data. I have pooled data from about 25 countries, taking the 2 most recent waves from each country. My dependent variable is neghaz (the negative of height for age (cm/months)), which is continuous. My regression specification includes several control variables, including squared terms and interaction terms. It also includes variables calculated at the PSU/cluster level (mean employment rate in the cluster, etc.). I have some doubts regarding weighting and clustering.
1. The number of observations varies widely across countries: India has more than 200,000 valid observations, while the African countries have only a few thousand each. Should I use weights in the regression analysis in this case?
I ran the regression both ways, i.e., with and without weights, and the standard errors changed substantially. Is the weighted regression the better choice for this analysis? (reg neghaz $controlset [pweight = perweight], cluster(psuid)). The weights that I am using have already been de-normalized.
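For concreteness, the weighted and unweighted runs can be put side by side as in the sketch below. It reuses the names from the command above (neghaz, $controlset, perweight, psuid). The variable survey (one value per country-wave) and the new variable psu_unique are assumptions for illustration: the egen step only matters if PSU codes repeat across surveys, which may not be the case in this dataset.

```stata
* Assumption: psuid codes may repeat across surveys, so build an ID that is
* unique within the pooled data. "survey" is assumed to identify each country-wave.
egen psu_unique = group(survey psuid)

* Unweighted regression, standard errors clustered on PSU
regress neghaz $controlset, vce(cluster psu_unique)
estimates store unweighted

* Weighted regression using the de-normalized DHS weight
regress neghaz $controlset [pweight = perweight], vce(cluster psu_unique)
estimates store weighted

* Coefficients and standard errors from both runs in one table
estimates table unweighted weighted, b(%9.3f) se(%9.3f)
```

Looking at the coefficients as well as the standard errors helps here: if only the standard errors move, the weights mainly affect precision, whereas a shift in the coefficients suggests that the weights are changing which countries dominate the estimates.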
2. I read on some forums that when pooling data across countries, clustering should be done at the country level as well as at the PSU/cluster level. Some people recommended using fixed effects by adding "i.survey" to the model specification. Others recommended a hierarchical model to account for clustering at multiple levels. Which model should I use for this analysis, and why? I have read some of the literature on this, but it only left me more confused.
When I ran the regression with an i.survey term, both the coefficients and the standard errors changed.
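A minimal sketch of the two suggestions, to make the comparison concrete. It assumes that survey is a numeric variable identifying each country-wave and that psuid identifies PSUs within a survey; the mixed specification is only one possible reading of "hierarchical model" and is shown without weights, since multilevel models need weights specified separately for each level.

```stata
* Suggestion A: survey fixed effects, PSU-clustered standard errors.
* i.survey absorbs level differences in neghaz between country-waves.
regress neghaz $controlset i.survey [pweight = perweight], vce(cluster psuid)

* Suggestion B: multilevel model with random intercepts for survey
* and for PSU nested within survey (unweighted in this sketch).
mixed neghaz $controlset || survey: || psuid:
```

The two are not interchangeable: the fixed-effects version uses only within-survey variation in the controls, while the random-intercept version also draws on between-survey variation, which is one reason the coefficients can differ.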
I am sorry if my questions sound noobish. This is my first time working with pooled data from so many surveys.