Hi there
I am hoping someone can advise me on a complex dataset derived from a dual-frame complex sample design, provided by the IRS and conducted by the Federal Reserve; the link is https://www.federalreserve.gov/econres/scfindex.htm. Because of the large amount of missing data, the Fed imputed replacement values beforehand and released five replicate datasets that contain these multiply imputed values. Hence, the apparent sample size of approximately 28885 is actually only 5777. They also provided a replicate-weight dataset, from which I computed an average weight to normalize the population weight so that it reflects the actual sample size. The variable x42001 is given as the population weight (proportions representing the actual population).
I first generated a new weight variable, nwgt, by dividing x42001 by five times the average weight:
* nwgt is the population-normalized version of x42001; results that use it are population weighted
gen double nwgt = x42001/(22268.03*5)
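The normalization above can also be sketched without typing the mean in by hand. This is only an illustration, assuming the five implicates are stacked in a single file and that 22268.03 is the mean of x42001 over all rows; it takes the mean from the data and then checks that the normalized weights sum to the number of households.

```stata
* Sketch only: assumes all five implicates are stacked in one file
* and that nwgt has not been created yet.
quietly summarize x42001, meanonly
generate double nwgt = x42001 / (r(mean) * 5)

* The normalized weights should sum to the number of rows divided by 5.
quietly summarize nwgt
display "sum of nwgt = " r(sum)
display "rows / 5    = " r(N) / 5
```

Reading r(mean) from summarize keeps full precision and avoids redoing the arithmetic if the file changes.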
However, I have three issues:
1. I can apply the weight to descriptive statistics for categorical variables, but not for continuous variables. For example, for home ownership, tab townhome [iweight=nwgt] gives N=5777, which is what I want. But tabstat age [iweight=nwgt], stat(count mean sd p50 min max) returns the error message: iweights not allowed
2. There is also another weight variable in the dataset, wgt (aweight=wgt), because the survey oversampled the wealthy and white population. I want to show the weighted versus the unweighted data, so how can I combine iweight and aweight in the same command line?
3. When I run the analysis, in this case a multinomial logit, I can't seem to use iweights either: mlogit risktol age i.gender i.townhome [iweight=nwgt]
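To make the three issues concrete, here is a rough sketch of the kinds of commands involved, not a worked-out solution. It relies only on documented weight support: tabstat accepts aweights and fweights, summarize accepts iweights (without the detail option), a Stata command takes a single weight expression, and mlogit accepts pweights. Whether pweights on x42001 are appropriate for this design is an assumption, and none of these lines combine results across the five implicates.

```stata
* Issue 1: tabstat takes aweights or fweights, not iweights.
* Weighted means and percentiles do not depend on the scale of the weight.
tabstat age [aweight=nwgt], statistics(count mean sd p50 min max)

* summarize does accept iweights, though not together with detail.
summarize age [iweight=nwgt]

* Issue 2: only one weight expression is allowed per command, so an
* unweighted versus weighted comparison needs two separate runs.
tabstat age, statistics(mean sd p50)
tabstat age [aweight=wgt], statistics(mean sd p50)

* Issue 3: a pweight version of the model (assumes x42001 can be treated
* as a sampling weight; standard errors ignore the multiple imputation).
mlogit risktol age i.gender i.townhome [pweight=x42001]
```

The count reported under aweights need not equal 5777, so it is worth checking which N each command displays before reporting it.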
Would be grateful for advice.
Thank you.
Yours truly
LG