Hi all,
I would like to merge together 16 datasets which are defined by country and are made as follows:
AUS.dta is:
var1
Mol1_a a1
Mol2_a a2
Mol3_a a3
Mol4_a a4
Mol5_a a5
...
Molk_a ak
GER.dta is:
var1
Mol1_g g1
Mol2_g g2
Mol3_g g3
Mol4_g g4
Mol5_g g5
...
Molk_g gk
where Moli_g is different from Moli_a. Both are string variables. And so on for other 14 countries.dta.
Now what I would like to do (possibly in a loop) is to merge together the datasets so that if a value (molecule) is missing in a country the data is maintained only in the country(its) where var1 is not missing and in the others it is put as missing.
For instance say Mol2_a and Mol5_a are present in AUS.dta but not in GER.dta and at the same time Mol1_g in GER.dta is not present in AUS.dta while other values are in common in the sense that Mol3_a coincides with Mol6_g for instance and Mol4_a with Mol5_g (and other with others molecules in GER.dta e.g. Mol6_a with g23 and so on)The resulting db should look like this:
GER AUS
Moll1_g g1 .
Moll1_a g2 .
Moll2_g g2 .
Moll2_a . a2
Moll3_g g3 a7
Moll3_a g6 a3
Moll4_g g2 .
Moll4_a g5 a4
Moll5_g g5 a8
Moll5_a . g5
Moll6_g g6 a3
Moll6_a a2 g23
Then I'll drop the duplicates...
0 Response to merging datasets with different dimensions
Post a Comment