Dear Stata-community,
I am working on a dataset with information on household expenditures. The dataset is small, with 101 observations, and some variables have values that are missing at random. Whether a household had a specific expenditure is recorded in T3_2_1_`x'_B7, where 1 stands for "yes". Expenditures were collected per month.
For some of the expenditure items there is missing data, which I am trying to impute based on household size and household income (T1_A1 and agr_nonagr_offfarm_inc, respectively). My ultimate objective is to complete the data so that I can calculate expenditure per year, so the final result needs to be multiplied by 12.
I first created a local macro listing the variables for which I have missing information ("costs_1m"). Then I set up the mi commands in a loop. The Stata output showed that the values had been imputed (restricted by my filter T3_2_1_`x'_B7 == 1). I imputed into a new variable because I didn't want to alter my original data (which of course I have also saved elsewhere). Right after, I used another loop to calculate the annual expenditure for each of the variables that had data imputed.
That is what I have done so far. My basic question is how to proceed now, because my dataset no longer contains "only" my 101 observations but many more, which I don't know how to handle. All I want is a complete dataset from which to calculate households' expenditures. As I am working with different files in which I store other data for the same households, should I pay attention to anything specific while handling my data?
Thank you very much in advance for any help.
Best regards, Gabriel
local costs_1m R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R16 R18 R20
di "`costs_1m'"
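* Declare the data as mi data in mlong style and set the number of imputations to 20
mi set mlong
mi set M = 20
* The predictors are registered once, before the loop
mi register regular T1_A1 agr_nonagr_offfarm_inc
foreach x of local costs_1m {
    * Work on a copy so the original variable stays untouched
    gen im_T3_2_1_`x'_B8 = T3_2_1_`x'_B8
    mi register imputed im_T3_2_1_`x'_B8
    * Impute only for households that reported this expenditure; add(20) requests 20 more imputations on each pass
    mi impute mvn im_T3_2_1_`x'_B8 = T1_A1 agr_nonagr_offfarm_inc if T3_2_1_`x'_B7 == 1, add(20) rseed(3456)
    label var im_T3_2_1_`x'_B8 "Costs - with imputed values"
    order im_T3_2_1_`x'_B8, after(T3_2_1_`x'_B8)
}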
foreach x of local costs_1m {
    * Annual expenditure = 12 x monthly expenditure, computed where the monthly amount is above zero
    gen T3_2_1_`x'_B9B = 12 * im_T3_2_1_`x'_B8 if im_T3_2_1_`x'_B8 > 0
    label var T3_2_1_`x'_B9B "Annual expenditure"
    order T3_2_1_`x'_B9B, after(T3_2_1_`x'_B9A)
}