I'm quite new to Stata and would really appreciate some help. I have a dataset of more than 200 firms with various firm characteristics, including industry affiliation (see the example below). Each firm belongs to multiple industry groups. My goal is to cluster the firms by the similarity of their industry group affiliations and to create a new categorical variable that records which of 3 clusters each firm belongs to. Does anyone have experience with this kind of problem, or can anyone suggest how best to approach it? Thank you so much in advance!
Data:
firm_id | industry_groups |
1 | Advertising, Commerce and Shopping, Sales and Marketing |
2 | Advertising, Media and Entertainment, Mobile, Sales and Marketing, Software |
3 | Energy, Natural Resources, Sustainability |
... | ... |
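To make the goal concrete, here is a minimal, language-neutral sketch of one possible framing, written in Python rather than Stata. It assumes that "similarity" means the overlap between firms' sets of industry groups (Jaccard distance) and that average-linkage hierarchical clustering is acceptable; both are assumptions, not requirements of the problem. The helper names `parse_groups`, `jaccard_distance`, and `cluster_firms` are hypothetical, and the three firms are the ones from the excerpt above.

```python
from itertools import combinations


def parse_groups(text):
    """Split a comma-separated industry string into a set of labels."""
    return {part.strip() for part in text.split(",") if part.strip()}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_firms(firms, k):
    """Average-linkage agglomerative clustering on Jaccard distance.

    firms maps firm_id -> industry string; the result maps
    firm_id -> cluster label in 1..k.
    """
    sets = {fid: parse_groups(text) for fid, text in firms.items()}
    clusters = [[fid] for fid in sets]  # start with one cluster per firm

    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for i, j in combinations(range(len(clusters)), 2):
            dists = [
                jaccard_distance(sets[a], sets[b])
                for a in clusters[i]
                for b in clusters[j]
            ]
            avg = sum(dists) / len(dists)
            if best is None or avg < best[0]:
                best = (avg, i, j)
        _, i, j = best
        # Merge cluster j into cluster i (i < j, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return {
        fid: label
        for label, members in enumerate(clusters, start=1)
        for fid in members
    }


# The three firms shown in the data excerpt.
firms = {
    1: "Advertising, Commerce and Shopping, Sales and Marketing",
    2: "Advertising, Media and Entertainment, Mobile, Sales and Marketing, Software",
    3: "Energy, Natural Resources, Sustainability",
}

# Only three firms are listed, so k=2 is used here; the full data would use k=3.
labels = cluster_firms(firms, k=2)
print(labels)
```

With these three rows, firms 1 and 2 share two industry groups and end up in the same cluster, while firm 3 stays on its own. In Stata the analogous route would presumably be one 0/1 indicator variable per industry group, followed by `cluster averagelinkage` with `measure(Jaccard)` and `cluster generate` to store the group membership, but that is an untested suggestion.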