I am working with some data on children (siblings) from various households from an Indian panel survey (IHDS).
I have one dataset with 7,862 observations. These are at the individual level, with about 2-4 children per household on average, based on the conditions I set to create the sample I am interested in investigating.
I also have a second dataset with 3,070 observations on the head (parent or grandparent) of each household that the children in dataset-1 belong to.
I would like to take the household head's information (specifically the income and education variables) and attach it to the respective children in dataset-1.
I am struggling with this because dataset-1 has more than one observation per household (due to siblings), while dataset-2 has only one observation, the household head, for each household.
How can I repeat the head's income and education values for every child from his/her household?
Below is an example of dataset-1 (the children from the households).
The first column shows their unique id.
The second column shows their household id (HHBASE). (As you may notice, it repeats once for each sibling.)
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str12 id double HHBASE
"10102010205" 1010201020
"10102010206" 1010201020
"10102010207" 1010201020
"10102010304" 1010201030
"10102010305" 1010201030
"10102010306" 1010201030
"10102010307" 1010201030
"10102011304" 1010201130
"10102011305" 1010201130
"10102011306" 1010201130
"10102011307" 1010201130
"10102011702" 1010201170
"10102011703" 1010201170
"10102011704" 1010201170
"10102011705" 1010201170
"10102012007" 1010201200
"10102020103" 1010202010
"10102020104" 1010202010
"10102020105" 1010202010
"10102020403" 1010202040
"10102020405" 1010202040
"10102020505" 1010202050
"10102021205" 1010202120
"10102021206" 1010202120
"10102030105" 1010203010
"10102030106" 1010203010
"10102030107" 1010203010
"10102030207" 1010203020
"10102030208" 1010203020
end
Below is an example of dataset-2 (the household heads for the same households as in dataset-1).
Just like dataset-1, the first column shows their unique id.
The second column shows their household id (HHBASE). (As you may notice, the difference from dataset-1 is that each household has only one observation.)
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str12 id double HHBASE
"10102010201" 1010201020
"10102010301" 1010201030
"10102011301" 1010201130
"10102011707" 1010201170
"10102012001" 1010201200
"10102020101" 1010202010
"10102020401" 1010202040
"10102020502" 1010202050
"10102021203" 1010202120
"10102030101" 1010203010
"10102030201" 1010203020
"10102030401" 1010203040
"10102030501" 1010203050
"10102030601" 1010203060
"10102030701" 1010203070
"10102031304" 1010203130
"10102040301" 1010204030
"10102040601" 1010204060
"10102040901" 1010204090
end
How can I make my final dataset look something like the following? (manually typed example)
(Child's) id (DS1) | (Child's) HHBASE (id) | (HEAD'S) INCOME (id) | (HEAD'S) EDUCATION (id) |
10102010205 | 1010201020 | A | A |
10102010206 | 1010201020 | A | A |
10102010305 | 1010201030 | B | B |
10102010306 | 1010201030 | B | B |
10102010307 | 1010201030 | B | B |
10102011702 | 1010201170 | C | C |
10102011703 | 1010201170 | C | C |
Your input is highly appreciated!
Thank you in advance!
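One possible approach, offered as a sketch rather than a definitive answer: since dataset-2 has one row per HHBASE and dataset-1 has several, a many-to-one merge on HHBASE (merge m:1) should copy each head's values onto every child in that household. The example below uses a few rows from the two dataex listings above. The name head_id is an assumption introduced here so that the head's id does not collide with the child's id; in the real data the head's income and education variables (whose actual names are not shown in the post) would be renamed and kept in the same way.

```stata
* Step 1: household heads (dataset-2), a few rows from the example above
clear
input str12 id double HHBASE
"10102010201" 1010201020
"10102010301" 1010201030
"10102030401" 1010203040
end

* Both files have a variable called id, so rename the head's copy first.
* head_id is a hypothetical name; rename the head's income and education
* variables the same way in the real data before saving.
rename id head_id

* Stops with an error if HHBASE does not uniquely identify the heads
isid HHBASE

tempfile heads
save `heads'

* Step 2: children (dataset-1), a few rows from the example above
clear
input str12 id double HHBASE
"10102010205" 1010201020
"10102010206" 1010201020
"10102010304" 1010201030
end

* Step 3: many children (m) to one head (1) per HHBASE.
* keep(master match) drops heads whose household has no child in dataset-1.
merge m:1 HHBASE using `heads', keep(master match)

* Check how many children found a head (_merge == 3) and how many did not (_merge == 1)
tabulate _merge

format HHBASE %10.0f
list id HHBASE head_id, sepby(HHBASE)
```

Because the sketch relies on a tempfile and local macros, it should be run as a whole from a do-file rather than line by line. If isid fails on the real dataset-2, some household has more than one head record, and that would need to be resolved before merging.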