Methodological question: matching/imputation based on two datasets

Hello,

I am struggling to find the right method for what I want to do with two household surveys. I have two datasets:
1) Dataset X, with socio-economic information (A1) and Z information
2) Dataset Y, with socio-economic information (A2) only

The Y dataset does not have the Z information, and this is what I want to impute based on the X dataset. The imputation/matching would be based on the socio-economic information (A1 and A2). Which method is best? I looked into multiple imputation (MI) with MAR options, where they use mixed-method multiple imputations, but that approach assumes the missing values are imputed within the SAME population, so I'm not sure I can use it with my data.

If my example is too abstract, consider this: I have two household survey datasets. X has expenditures on food, clothing, and household fuels, but Y does not, so I need to impute this information. I can do this because both datasets contain information on income, household size, appliance ownership, etc. So if the marginal distributions of these socio-economic characteristics are similar in X and Y, I can then impute the expenditure data.
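To make the idea concrete, here is a rough sketch of what I mean, written in Python with simulated data because I don't yet know the right Stata tool. Everything in it is an assumption for illustration only: the variable names (`a_x`, `a_y`, `z_x`), the two covariates (log income and household size), and the choice of a nearest-neighbour "donor" match are mine and not an established method for this problem. It first compares the covariate means in the two datasets, then gives each Y household the expenditure of the most similar X household.

```python
import numpy as np


def standardized_mean_diff(a_x, a_y):
    """Per-covariate standardized difference in means between X and Y."""
    pooled_sd = np.sqrt((a_x.var(axis=0, ddof=1) + a_y.var(axis=0, ddof=1)) / 2.0)
    return (a_x.mean(axis=0) - a_y.mean(axis=0)) / pooled_sd


def nearest_donor_impute(a_x, z_x, a_y):
    """For each Y household, copy Z from the closest X household (the donor)."""
    # Standardize with the pooled mean and sd so no covariate dominates the distance.
    pooled = np.vstack([a_x, a_y])
    mu = pooled.mean(axis=0)
    sd = pooled.std(axis=0, ddof=1)
    sx = (a_x - mu) / sd
    sy = (a_y - mu) / sd
    # Squared Euclidean distances between every Y row and every X row: shape (n_y, n_x).
    d2 = ((sy[:, None, :] - sx[None, :, :]) ** 2).sum(axis=2)
    donor = d2.argmin(axis=1)
    return z_x[donor], donor


# Simulated data (hypothetical): columns are log income and household size.
rng = np.random.default_rng(0)
n_x, n_y = 500, 300
a_x = np.column_stack([rng.normal(10, 1, n_x), rng.integers(1, 7, n_x)])
a_y = np.column_stack([rng.normal(10, 1, n_y), rng.integers(1, 7, n_y)])
# Expenditure (Z) is only observed in X.
z_x = 0.6 * a_x[:, 0] + 0.2 * a_x[:, 1] + rng.normal(0, 0.3, n_x)

smd = standardized_mean_diff(a_x, a_y)          # values near 0 suggest similar means
z_y_imputed, donor = nearest_donor_impute(a_x, z_x, a_y)
```

As far as I understand, this kind of matching only makes sense if the relationship between expenditure and the socio-economic variables is the same in both surveys, and copying a single donor value ignores the uncertainty of the imputation, which is part of why I am asking for a proper method.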

I would greatly appreciate any help. Even just naming methods or tools that are available in Stata would be super helpful!

Cheers,

Marta

BJ Data Tech Solution
