Hello,

I found it difficult to summarize my problem in a short heading, so the thread title might be misleading. I hope the description below makes it clearer.

I have a .dta file with various variables, including Company (string), NewName (string), and Year (int). Each company is observed multiple times over time, but there can be gaps: sometimes a company is not observed in one year and then observed again the year after. Whenever a company changes its name (which unfortunately happens quite often), I have an observation for that year with the old company name in Company and the new company name in NewName. In addition, I have an observation for the same year with the new company name in Company.

Here is an example of my data (I manually added a few rows so that it includes all relevant cases):


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int Year str56 Company str42 NewName
2014 "One Holding"    "1Time Holdings LTD"
2014 "1Time Holdings LTD"    ""
2016 "1Time Holdings LTD"                                       ""                              
2018 "1Time Holdings LTD"                                          ""                    
2019 "1Time Holdings LTD"                                       "Example Holding"        
2019 "Example Holding"                                       ""                              
2018 "4SIGHT HOLDINGS LTD"                                      ""                              
2019 "4SIGHT HOLDINGS LTD"                                      ""                              
2019 "ABSA GROUP LTD"                                           ""                              
2017 "ACCELERATE PROPERTY FUND LTD"                             ""                              
2018 "ACCELERATE PROPERTY FUND LTD"                             ""                              
2019 "ACCELERATE PROPERTY FUND LTD"                             ""                              
2017 "ACCENTUATE LTD"                                           ""                              
2018 "ACCENTUATE LTD"                                           ""                              
2019 "ACCENTUATE LTD"                                           ""                              
2017 "ACSION LTD"                                               ""                              
2018 "ACSION LTD"                                               ""                              
2019 "ACSION LTD"                                               ""                              
2017 "ACUCAP PROPERTIES LTD"                                    ""                              
2017 "ADAPTIT HOLDINGS LTD"                                     ""                              
2018 "ADAPTIT HOLDINGS LTD"                                     ""                              
2019 "ADAPTIT HOLDINGS LTD"                                     ""                              
end
Now I want a new variable ("AllNames") that contains every name a company has used (for example, I want to use it later to create an ID variable). For One Holding, which became 1Time Holdings LTD in 2014 and then Example Holding in 2019, the variable should be "One Holding, 1Time Holdings LTD, Example Holding".

I was able to partly generate such a variable, for the observations that contain both the old and the new name, with:

Code:
gen AllNames=""
replace AllNames =Company + ", " + NewName if NewName!=""
Unfortunately, this gives the correct AllNames only if the company changed its name just once. I then tried to fill AllNames for the earlier observations by using a panel data structure, namely:

Code:
gen NewYear=-Year
* xtset rejects a string panel variable, so sort within Company directly
bysort Company (NewYear): gen lag_AllNames=AllNames[_n-1]
replace AllNames=lag_AllNames if AllNames==""
Unfortunately, this code only works if there are no gaps in a company's observations. I might be able to work around that by taking lag 2 or lag 3 instead of lag 1, but this approach does not work at all for the observations that carry the new company name in Company. I suspect there is a much more straightforward approach to the problem, but I cannot think of one at the moment.
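
One possible direction, shown only as a rough sketch and not as a tested solution: treat every name as a node and every rename row as a link between two nodes, then give all linked names one common label. This does not depend on lags, so gaps in the years should not matter. The sketch assumes it is run on the example data right after the input block (no AllNames variable yet); the names row, which, grp, firstyear, and isnew are made up for illustration. Ordering the names by first year is also an assumption, and it can tie if a company is renamed twice in the same year.

```stata
* Stack Company and NewName into one "name" column, keeping the source row
gen long row = _n
preserve
keep row Year Company NewName
rename (Company NewName) (name1 name2)
reshape long name, i(row) j(which)
drop if name == ""

* Spread the smallest label across shared names and shared rows until stable
gen long grp = row
local changed = 1
while `changed' {
    bysort name: egen long m1 = min(grp)
    bysort row:  egen long m2 = min(m1)
    count if m2 != grp
    local changed = r(N)
    replace grp = m2
    drop m1 m2
}

* One line per name, ordered by first year; a name first seen as a
* NewName sorts after the old name from the same year
bysort name: egen int firstyear = min(Year)
bysort name: egen byte isnew = max(which == 2 & Year == firstyear)
bysort name: keep if _n == 1
bysort grp (firstyear isnew name): gen strL AllNames = name if _n == 1
by grp: replace AllNames = AllNames[_n-1] + ", " + name if _n > 1
by grp: replace AllNames = AllNames[_N]

* Attach the result to every original observation via Company
keep name grp AllNames
rename name Company
tempfile names
save `names'
restore
merge m:1 Company using `names', keep(master match) nogenerate
drop row
```

In this sketch grp already works as a company identifier, so a separate ID variable built from AllNames may not be needed.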

I hope you can help me. If you need further information on the problem, please let me know.

Best regards
Nina