Dear Stata-community,
I am working on a dataset with information on household expenditures. The dataset is small (101 observations), and some variables have values missing at random. Whether a household had a specific expenditure is recorded in T3_2_1_`x'_B7, where 1 stands for "yes". Expenditures were collected per month.
For some of the expenditure items there is missing data, which I am trying to impute based on household size and household income (T1_A1 and agr_nonagr_offfarm_inc, respectively). My ultimate objective is a complete dataset from which I can calculate expenditure per year, so the final result needs to be multiplied by 12.
I first created a local macro with the variables for which I have missing information ("costs_1m"). Then I set up the mi commands in a loop. The Stata output showed that values had been imputed (according to my filter T3_2_1_`x'_B7 == 1). I imputed into a new variable for each item because I didn't want to alter my original data (which I have of course also saved elsewhere). Right after, I used another loop to calculate the annual expenditure for each of the variables that had data imputed.
That is what I have done so far. My basic question is how to proceed now: my dataset no longer contains only my 101 observations, but many more rows that I don't know how to handle. All I want is a complete dataset from which I can calculate households' expenditures. Since I am working with several files that store other data for the same households, is there anything specific I should pay attention to while handling my data?
Thank you very much in advance for any help.
Best regards, Gabriel
* Expenditure items with missing monthly amounts
local costs_1m R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R16 R18 R20
di "`costs_1m'"

mi set mlong
mi set M=20

* Impute each item from household size and income
foreach x of local costs_1m {
    * Copy the original variable so the raw data stay untouched
    gen im_T3_2_1_`x'_B8 = T3_2_1_`x'_B8
    mi register imputed im_T3_2_1_`x'_B8
    mi register regular T1_A1 agr_nonagr_offfarm_inc
    mi impute mvn im_T3_2_1_`x'_B8 = T1_A1 agr_nonagr_offfarm_inc if T3_2_1_`x'_B7 == 1, add(20) rseed(3456)
    label var im_T3_2_1_`x'_B8 "Costs - with imputed values"
    order im_T3_2_1_`x'_B8, after(T3_2_1_`x'_B8)
}

* Annual expenditure = 12 x monthly expenditure
foreach x of local costs_1m {
    gen T3_2_1_`x'_B9B = 12*im_T3_2_1_`x'_B8 if im_T3_2_1_`x'_B8 > 0
    label var T3_2_1_`x'_B9B "Annual expenditure"
    order T3_2_1_`x'_B9B, after(T3_2_1_`x'_B9A)
}
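
To make the "many more rows" issue concrete, the following is a minimal sketch of how the extra observations could be inspected. It assumes the data are still mi set in mlong style as in the code above, and it uses R1 only as an example item from the costs_1m list. It is meant as a diagnostic aid, not as a recommended analysis workflow.

```stata
* Sketch only: assumes the data are still -mi set- in mlong style.

* Report the mi style, the number of imputations M,
* and which variables are registered as imputed or regular
mi describe
mi query

* In mlong style, _mi_m is 0 for the original observations and
* 1, 2, ... for the added copies of the incomplete observations
tabulate _mi_m
count if _mi_m == 0

* Summarize one imputed variable in the original data (m = 0)
* and in the first imputation (m = 1); R1 is only an example item
mi xeq 0 1: summarize im_T3_2_1_R1_B8

* Temporarily look at the original data (m = 0) without the added rows;
* missing values are still missing in this copy
preserve
mi extract 0, clear
count
restore
```

The count for _mi_m == 0 should equal the number of households in the original file, which is one way to confirm that the additional rows are imputation copies rather than new households.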