Hi all,

I have the following two datasets. The first one looks like follows:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 atc3 float Year
"A10C" 2004
"A10C" 2005
"A10C" 2006
"A10C" 2007
"A10C" 2008
"A10C" 2009
"A10C" 2010
"A10C" 2011
"A10C" 2012
"A10C" 2013
"A10C" 2014
"A10C" 2015
"A10D" 2004
"A10D" 2005
"A10D" 2006
"A10D" 2007
"A10D" 2008
"A10D" 2014
"A10E" 2004
"A10E" 2005
"A10E" 2006
"A10E" 2007
"A10E" 2008
"A10E" 2009
"A10E" 2010
"A10E" 2011
"A10E" 2012
"A10E" 2013
"A10E" 2014
"A10E" 2015
"A10H" 2004
"A10H" 2005
"A10H" 2006
"A10H" 2007
"A10H" 2008
"A10H" 2009
"A10H" 2010
"A10H" 2011
"A10H" 2012
"A10H" 2013
"A10H" 2014
"A10H" 2015
"A10J" 2004
"A10J" 2005
"A10J" 2006
"A10J" 2007
"A10J" 2008
"A10J" 2009
"A10J" 2010
"A10J" 2011
"A10J" 2012
"A10J" 2013
"A10J" 2014
"A10J" 2015
"A10K" 2004
"A10K" 2005
"A10K" 2006
"A10K" 2007
"A10K" 2008
"A10K" 2009
"A10K" 2010
"A10K" 2011
"A10K" 2012
"A10K" 2013
"A10K" 2014
"A10K" 2015
"A10L" 2004
"A10L" 2005
"A10L" 2006
"A10L" 2007
"A10L" 2008
"A10L" 2009
"A10L" 2010
"A10L" 2011
"A10L" 2012
"A10L" 2013
"A10L" 2014
"A10L" 2015
"A10M" 2004
"A10M" 2005
"A10M" 2006
"A10M" 2007
"A10M" 2008
"A10M" 2009
"A10M" 2010
"A10M" 2011
"A10M" 2012
"A10M" 2013
"A10M" 2014
"A10M" 2015
"A10N" 2006
"A10N" 2007
"A10N" 2008
"A10N" 2009
"A10N" 2010
"A10N" 2011
"A10N" 2012
"A10N" 2013
"A10N" 2014
"A10N" 2015
end
The second one does not contain the information of the Year but just the information of the act3 name, i.e.:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 atc3
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
"A10C"
end

I would like to merge the two on the atc3 name either not knowing if all atc3 of database 2 are present also in database 1 and being aware that in both databases atc3 variable does not uniquely identify observations. Is there a way to do it? I do not think a merge m:m should go: indeed the problem is that in the first database the atc3 repeats n times according to the Year variable (i.e. is an unbalanced panel) while in the second database it repeats m times according to other variables not present in the first database.

Thanks,

Federico