Dear Stata-community,
I am working on a dataset with information on household expenditures. The dataset is small (101 observations), and some variables have values that are missing at random. The variable T3_2_1_`x'_B7 records whether the household had a specific expenditure, with 1 standing for "yes". Expenditures were recorded per month.
For some of the expenditure items there are missing values, which I am trying to impute from household size and household income (T1_A1 and agr_nonagr_offfarm_inc, respectively). My ultimate objective is to obtain complete data so that I can calculate expenditure per year, so the final result needs to be multiplied by 12.
I first created a local macro ("costs_1m") listing the expenditure items for which I have missing information. Then I set up the mi commands in a loop. The Stata output showed that the values had been imputed (subject to my filter T3_2_1_`x'_B7 == 1). I imputed into a new variable because I didn't want to alter my original data (which of course I have also saved elsewhere). Right after that, I used another loop to calculate the annual expenditure for each of the variables that had data imputed.
That is what I have done so far. My basic question is how to proceed now, because my dataset no longer contains only my 101 observations but many more, and I don't know how to handle them. All I want is a complete dataset from which to calculate households' expenditures. Since I am working with several files that store other data for the same households, is there anything specific I should pay attention to while handling my data?
Thank you very much in advance for any help.
Best regards, Gabriel

* Expenditure items with missing monthly costs
local costs_1m R1 R2 R3 R4 R5 R6 R7 R8 R9 R10 R11 R16 R18 R20
di "`costs_1m'"

* Declare the mi style and register the complete predictors once
mi set mlong
mi set M = 20
mi register regular T1_A1 agr_nonagr_offfarm_inc

foreach x of local costs_1m {
    * Impute into a copy so that the original variable stays untouched
    gen im_T3_2_1_`x'_B8 = T3_2_1_`x'_B8
    mi register imputed im_T3_2_1_`x'_B8
    * add(20) is requested on every pass of the loop
    mi impute mvn im_T3_2_1_`x'_B8 = T1_A1 agr_nonagr_offfarm_inc if T3_2_1_`x'_B7 == 1, add(20) rseed(3456)
    label var im_T3_2_1_`x'_B8 "Costs - with imputed values"
    order im_T3_2_1_`x'_B8, after(T3_2_1_`x'_B8)
}

* Annual expenditure = 12 x monthly cost, where the monthly cost is above zero
foreach x of local costs_1m {
    gen T3_2_1_`x'_B9B = 12*im_T3_2_1_`x'_B8 if im_T3_2_1_`x'_B8 > 0
    label var T3_2_1_`x'_B9B "Annual expenditure"
    order T3_2_1_`x'_B9B, after(T3_2_1_`x'_B9A)
}
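
To make the problem concrete, here is a minimal, self-contained sketch with simulated data instead of my real file. The variable names hh_size, income, cost and cost_year, the simulated values, and the scalar n_mlong are all made up for illustration. As far as I understand it, the sketch shows why the dataset grows in the mlong style (the original 101 rows are kept as m = 0, and each imputation adds one row per incomplete observation), and how the original rows could be recovered with mi extract 0. I am not sure this is the recommended workflow, so please treat it as an assumption on my part.

```stata
* Simulated stand-in for my data: 101 households (all names are hypothetical)
clear
set seed 3456
set obs 101
gen hh_size = ceil(8*runiform())
gen income  = exp(rnormal(8, 0.5))
gen cost    = 50 + 10*hh_size + 0.01*income + rnormal(0, 20)
replace cost = . if mod(_n, 5) == 0      // 20 households with a missing cost

* Declare the mi structure once, then impute once with 20 imputations
mi set mlong
mi register imputed cost
mi register regular hh_size income
mi impute mvn cost = hh_size income, add(20) rseed(3456)

* In mlong style the data hold m = 0 (all 101 rows) plus, for each
* imputation, one row per incomplete observation: 101 + 20*20 rows here
mi describe
count
count if _mi_m == 0
scalar n_mlong = _N                      // row count of the mi data, kept for comparison

* Derived variables go through mi passive so that they exist in every imputation
mi passive: generate cost_year = 12*cost

* Analyses are run across imputations with mi estimate
mi estimate: mean cost_year

* Recover the original (unimputed) 101 observations as an ordinary dataset
mi extract 0, clear
count
```

If this is right, the extra rows are not a problem in themselves, and the imputed data should be analysed through mi estimate rather than by treating the long file as if it were one ordinary dataset.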