Hello everyone,

I am trying to merge two datasets containing observations on firms. Each dataset contains observations on firms across multiple years. The problem is that these datasets come from different databases and the ID from one dataset changes over time. For example: for a company, the ID (which is a code) is EP407011 from 2000 to 2007 and EPB1L2YG after 2007. In the second dataset, the ID is always EPB1L2YG.

Is there a way to merge changing the ID if there is a match at least in one year, using the ID for which there is a match for all years? So in our previous case: if there is a match after 2007 stata will need to change EP407011 to EPB1L2YG from 2000 to 2007.

I don't know if it could be easier but I don't strictly need the merge but only to attach a variable contained in the second dataset to the first. So alternatively, extend this variable (which is another identification code, thus a constant) from the year with a match to the year without a match.

I have no idea where to start to write such a code, and I would like to not lose all matches before 2007 (in the example).

Thank you a lot for your help,
Kind regards,
Gino