Hi:
This is not a Stata-related question, so please forgive me if it is not allowed here.
I am working with the NHIS database, which has a complex survey design with clustering, stratification, and oversampling of certain subpopulations. The PSUs are mainly counties or groups of contiguous counties, which are later stratified based on MSA status. For confidentiality reasons, however, the NHIS only provides pseudo-stratum and pseudo-PSU codes. For the survey period that I am interested in, there were 304 strata and 482 PSUs. However, there are 300 pseudo-strata, each containing 2 pseudo-PSUs, so 600 pseudo-PSUs in total. My confusion stems from the fact that the manual says the pseudo-PSUs were constructed by collapsing the original PSUs into bigger clusters so that it would be more difficult to identify any given cluster. If that is the case, how can there be more pseudo-PSUs than original PSUs?
I am trying to include some measure of area-specific fixed effects in my panel regression, and I was thinking of using the pseudo-PSUs as a proxy for geographic area. The paper linked below says that "a given geographic area within a given NHIS sample PSU should have the same set of Pseudo-Stratum and Pseudo-PSU codes assignments if it is present in more than one NHIS annual microdata file." Doesn't that imply that the original PSUs are broken down into pseudo-PSUs, which would explain why there are more pseudo-PSUs than original ones? Then why does the manual say that the PSUs are merged or collapsed?
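To make the idea concrete, here is a minimal sketch of the area proxy I have in mind. It assumes that pseudo-PSU codes repeat across pseudo-strata (2 per stratum), so an area would have to be identified by the pair of codes and not by the pseudo-PSU code alone. The records, field order, and the `area_id` helper are made up for illustration and are not taken from the NHIS files.

```python
from collections import defaultdict

# Hypothetical records: (survey_year, pseudo_stratum, pseudo_psu).
# These values are invented for illustration, not real NHIS data.
records = [
    (2006, 1, 1),
    (2006, 1, 2),
    (2006, 2, 1),
    (2007, 1, 1),
    (2007, 2, 2),
]


def area_id(pseudo_stratum, pseudo_psu):
    """Combine both codes into one identifier.

    Assumption: a pseudo-PSU code is only unique within its pseudo-stratum.
    """
    return f"{pseudo_stratum:03d}-{pseudo_psu}"


# Collect the survey years in which each candidate area appears.
years_by_area = defaultdict(set)
for year, stratum, psu in records:
    years_by_area[area_id(stratum, psu)].add(year)

# Areas present in more than one annual file could carry a fixed effect.
repeated_areas = sorted(a for a, yrs in years_by_area.items() if len(yrs) > 1)
print(repeated_areas)
```

Whether such an identifier corresponds to a stable geographic area across years is exactly what I am unsure about.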
Here is a link to their manual: http://www.asasrms.org/Proceedings/y2007/Files/JSM2007-000353.pdf
I would be really grateful if any kind soul could help me out!