Dear all,

I would like to compare the educational institutions of two workers and code a binary variable that indicates whether they attended the same university (regardless of the type of degree, time period, etc.).

The structure of the dataset is shown below. Each observation is a pair of a worker (worker_id) and a coworker (coworker_id), together with the names of the institutions each of them attended for their Bachelor's, Master's, etc. degrees. The variables shown are just an excerpt; the full dataset contains a range of further variables, such as MBA university and PhD university, which are omitted here for brevity. The names of the institutions are standardized (i.e. there are no different spellings for the same institution), so I would like to do an exact comparison rather than similarity scoring.

The goal is to compare all the education variables of a worker with those of the coworker; if they attended the same university at some point, the new binary variable "same_university" should take the value 1. For example, in line 3 below, worker_id 44 attended the University of Puget Sound and so did coworker_id 1, so "same_university" would take the value 1. Similarly, in line 5 of the example below, both workers attended the University of Puget Sound (it is irrelevant that they obtained different types of degrees or may not have been there at the same time).

How would you recommend coding this?

Thanks a lot for your help again!


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(worker_id coworker_id) str33 worker_ba_uni_std str26 worker_ma_uni_std str25 coworker_ba_uni_std byte coworker_ma_uni_std
72 1 "hobart and william smith colleges" "dartmouth college"          "university of puget sound" .
27 1 "stanford university"               "stanford university"        "university of puget sound" .
44 1 "university of puget sound"          ""                           "university of puget sound" .
28 1 "city university of new york"       ""                           "university of puget sound" .
 5 1 "indian institute of technology"    "university of puget sound" "university of puget sound" .
17 1 "stanford university"               ""                           "university of puget sound" .
15 1 "stanford university"               ""                           "university of puget sound" .
72 1 "hobart and william smith colleges" "dartmouth college"          "university of puget sound" .
19 1 "dartmouth college"                 ""                           "university of puget sound" .
end
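
To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes that every worker education variable matches the pattern worker_*_uni_std and every coworker education variable matches coworker_*_uni_std (an assumption about the full dataset). It also assumes that a variable stored as numeric, like coworker_ma_uni_std in the excerpt above, is numeric only because it is entirely missing. Empty strings are treated as "no institution" and never count as a match.

```stata
* Sketch under stated assumptions; the variable-name patterns are assumed.
unab wvars : worker_*_uni_std
unab cvars : coworker_*_uni_std

* An all-missing variable may have been stored as numeric (as in the
* excerpt); replace it with an empty string variable so that the
* string comparisons below do not fail with a type mismatch.
foreach v of varlist `wvars' `cvars' {
    capture confirm numeric variable `v'
    if !_rc {
        assert missing(`v')   // stop if the variable holds real values
        drop `v'
        generate str1 `v' = ""
    }
}

* Compare every worker variable with every coworker variable.
generate byte same_university = 0
foreach w of local wvars {
    foreach c of local cvars {
        replace same_university = 1 if `w' == `c' & `w' != ""
    }
}

list worker_id coworker_id same_university
```

In the excerpt above this should flag observations 3 and 5 and leave the rest at 0. The nested loop runs over every worker/coworker variable pair, so the number of replace commands grows with the product of the two variable counts, which should be fine for a handful of degree variables.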