Dear Statalisters,

I am trying to calculate a pairwise Jaccard similarity measure and am having trouble figuring out how to do so. My data are in the following format: the first variable, assignee_id, identifies the firm, and the other variables (law_1 to law_5) are dummy variables for its legal partners, where a 1 indicates that the firm has worked with that partner. I would like to calculate the pairwise similarity between firms based on how similar they are in their use of legal partners. I have tried a few different approaches but haven't gotten anywhere, so your help with the syntax would be much appreciated. I've attached a data example below.
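
To be explicit about the measure, this is the standard Jaccard definition as I understand it. The set notation $A_i$ is my own shorthand and not something in the data: $A_i$ is the set of legal partners coded 1 for firm $i$.

\[
J(i,j) = \frac{|A_i \cap A_j|}{|A_i \cup A_j|}
\]

For example, a firm with the pattern 0 1 0 1 0 and a firm with the pattern 0 1 0 1 1 share two partners (law_2 and law_4) out of the three used by either firm, so $J = 2/3$. Two firms with all zeros give $0/0$, so some convention is needed for those pairs.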

Thanks for your help.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 assignee_id byte(law_1 law_2 law_3 law_4 law_5)
"00d92f99f43508d37de79da7051b43c7" 1 0 0 0 0
"00e5262f320cbda9f15490debbe80858" 0 1 0 1 0
"031b354668d5ceefc7b4bb3ba57664d4" 0 0 0 1 1
"03810188291c60318b5b0da566c266fb" 0 1 0 1 0
"054d563b447b317f56d940f5e3dd7b39" 1 0 0 0 0
"05695a60b69eb9a0f6e781debe23e9cc" 1 0 0 0 0
"062af6b4d9f7708cfd5e659cd13a3726" 1 0 1 0 0
"081507e638fca84980f88a3c3f5cd1fa" 0 0 0 0 0
"099c2e138f83bf0366539bddfda6b2e2" 0 0 1 0 0
"09fc005ad2872886a676a2f4197ce018" 0 0 0 0 0
"0a00649f54947198768fa954f8756563" 0 0 0 0 0
"0a21a0cbd50fe6558b13d773effc9eb1" 0 0 1 0 0
"0a302a7b505844998614e26c7c26d4a0" 0 1 0 1 1
"0a4642a77d52197c97f5d592966b68d7" 0 1 1 1 1
"0a74e8eea755f3ab33162a52dc87bb5d" 1 0 0 0 0
"0bb9626cc72bbfaf9ae174a022ceb086" 0 1 0 0 0
"0c65f80fcfe79b0c4732a7ebc645da8c" 0 1 0 0 0
"0ceb8b624ea012dea6d0c3705d4f547e" 0 1 1 0 0
"0d5c37ddbc9800bfc84774afe4b36faa" 1 0 0 0 1
"0d5fb33b90b1825b0003a1573d7477fe" 0 0 0 0 0
"0d6c6c25cf34819e50fd97318db9b699" 0 1 0 0 0
"0ee26da954c6572b783432f619a301e3" 1 0 0 0 0
"0f4a6ddb6c4a854440e1123924820706" 0 0 0 0 1
"0fa5a08e051f6bb467854f4bbb913a46" 0 0 1 1 0
"1005528d1a3c548b2403fba94f0927f5" 1 0 0 1 0
"107da3bb737c53c0d39645f72ede8b86" 0 0 0 0 1
"10b108b4ee97bab2304d092590c0bf7c" 0 0 0 0 1
"11127e943b93352979514b124179eb94" 1 0 0 1 0
"11f00a94b4fe1138e00af83137db2fac" 0 0 1 0 0
"134d75dd2f4984f02db90d441336fd2e" 0 1 1 0 0
end
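
A possible starting point, offered as a sketch rather than a tested solution: Stata's -matrix dissimilarity- command compares observations and accepts Jaccard as a binary similarity measure. The matrix name J is only an illustrative choice. How pairs of all-zero firms are scored should be checked against the documentation for the measure, and with many firms the result may run into Stata's matrix size limits.

Code:
* Sketch: pairwise Jaccard similarity between observations (firms)
* J is an illustrative matrix name; rows and columns follow the order of the observations
matrix dissimilarity J = law_1-law_5, Jaccard
matrix list J

Each off-diagonal entry of J would then be the similarity between the two firms in the corresponding rows of the data.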