Hello,

In order to obtain repeated cross-sectional data, I have appended 3 datasets stemming from different rounds of a household survey in Kenya (survey years 2006, 2009, 2015).
Each of the 3 surveys comes with a variable that specifies the district of the interviewed household, however the number of unique districts increases with every survey year.

Now I would like to generate a new variable that gives me the district for all observations in the dataset, i.e. combine district2006, district2009 and district2015 into one variable. However, I would like its values to be generated as missing if the district of the household was not included in the 2006 survey:

District2006 ={A,B}, District2009={A,B,C}, District2015={A,B,C,D,E}
-> District060915= {A,B}

I hope the following table makes it a bit more clear:
ID District2006 District2009 District2015 *District060915*
1-06 A . . A
2-06 B. . . B
1-09 A . A
2-09 . B . B
3-09 . C . .
1-15 A A
2-15 . B B
3-15 C .
4-15 D .
5-15 E .















Is there an efficient way to do this? Or do I have to manually compare the three columns to create the 4th? Thank you very much in advance!

Best,
Sophia Magis