Hello!

I am having real trouble with merge. I have a dataset with the following relevant variables: companyname, year, and industry. What I want to achieve is something like the screenshot below, except that the example merges only by industry, whereas I want to merge by industry and year. For example, I want to know which companies are in the same industry as company A in each year, since company A may be in the sample for several years.

[Screenshot: example result of merging by industry]

My actual data has many observations, so here is a short hypothetical example to make things easier.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 companyname int year str1 industry
"1" 2019 "A"
"2" 2019 "B"
"3" 2019 "A"
"4" 2019 "C"
"1" 2020 "A"
"2" 2020 "B"
"3" 2020 "A"
"5" 2020 "C"
"2" 2021 "B"
"4" 2021 "C"
"6" 2021 "C"
"7" 2021 "B"
end
I tried:
Code:
use data, clear
duplicates drop companyname, force
isid companyname

merge 1:m industry year using data.dta
Some observations are dropped by duplicates drop, and I used isid to check whether companyname uniquely identifies the observations; it does. But when I try to merge, Stata says "variable industry year do not uniquely identify observations in the master data". So I wonder what's wrong here...
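
For reference, here is a minimal sketch of the check that merge 1:m actually relies on, plus one possible way to get the pairs I am after. It assumes the example data above is saved as data.dta. The variable name peer and the tempfile name peers are made up for illustration. The joinby step is a different technique from merge, and it is only my guess at an approach, not something I have confirmed is the right one.

```stata
* Sketch only: assumes the example data above has been saved as data.dta
use data, clear
duplicates drop companyname, force

* isid companyname passes, but merge 1:m checks the merge key instead.
* capture stores the return code in _rc rather than stopping the do-file.
capture isid industry year
display _rc
* A nonzero code means industry and year are not unique in the master data
* (companies 1 and 3 are both in industry A in 2019).

* Possible alternative: pair each company with its industry-year peers.
* "peer" and "peers" are hypothetical names used only for this sketch.
use data, clear
rename companyname peer
tempfile peers
save `peers'

use data, clear
joinby industry year using `peers'
* Drop the rows that pair a company with itself
drop if companyname == peer
sort companyname year peer
list companyname year industry peer, sepby(companyname year)
```

Note that a company with no peers in a given year (for example, company 4 in 2019) drops out entirely after the drop if line, so that line would need to change if such companies should stay in the result.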

Any help is appreciated! Thanks a lot in advance!