I have two data sets, one covering 2001 to 2005 and the other 2006 to 2010, and I want to compare people's performance across the two time periods.

Data set A
Name  Dummy Variable   X   Y   Z
A     1               12  32  87
B     0               32  43  47
A     0               17  46  57
C     1               23  45  54

Data set B
Name   X   Y
A     76  86
C     45  45
B     34  89
A     76  34
C     43  54

X is the dependent variable.
Would the right way to proceed be to collapse data set A by name, keeping the (max) of Dummy Variable, so that each name appears only once with a consistent dummy value, and then merge data set B m:1 with the collapsed data set A on name?
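
To make the proposed steps concrete, here is a minimal sketch in plain Python, chosen only for illustration. The lowercase column names (`name`, `dummy`, `x`, `y`, `z`) are assumptions, not the names in the actual files. In Stata the corresponding steps would be roughly `collapse (max) dummy, by(name)` on data set A, followed by `merge m:1 name using` the collapsed file while data set B is in memory.

```python
# Rows copied from the example tables above; column names are illustrative assumptions.
data_a = [
    {"name": "A", "dummy": 1, "x": 12, "y": 32, "z": 87},
    {"name": "B", "dummy": 0, "x": 32, "y": 43, "z": 47},
    {"name": "A", "dummy": 0, "x": 17, "y": 46, "z": 57},
    {"name": "C", "dummy": 1, "x": 23, "y": 45, "z": 54},
]
data_b = [
    {"name": "A", "x": 76, "y": 86},
    {"name": "C", "x": 45, "y": 45},
    {"name": "B", "x": 34, "y": 89},
    {"name": "A", "x": 76, "y": 34},
    {"name": "C", "x": 43, "y": 54},
]

# Step 1: collapse A to one entry per name, keeping the maximum dummy value.
dummy_by_name = {}
for row in data_a:
    name = row["name"]
    previous = dummy_by_name.get(name, row["dummy"])
    dummy_by_name[name] = max(previous, row["dummy"])

# Step 2: many-to-one merge. Each row of B (names may repeat) looks up
# the single dummy value for its name; unmatched names would get None.
merged = [{**row, "dummy": dummy_by_name.get(row["name"])} for row in data_b]

for row in merged:
    print(row)
```

Note that in this sketch only the dummy is carried over from data set A; A's own X, Y and Z values are dropped by the collapse step, so they would need to be kept separately if the comparison across periods requires them.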