Hi all!

I have a question about how to calculate the average distance between inventors per patent. My research aims to relate this average distance to team size per patent. I managed to calculate the latter quite easily, but I have not managed to write working code to create the average distance variable. I am still a novice in Stata.

Info on the data:
# obs: 11,7960,560
# variables: 5
Downloaded from: https://figshare.com/collections/Pat...Data/3458001/1

The data are structured as follows:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str12 patent double(longtitude latitude) float(N N2)
"3858241"  -71.52723  42.15059  2  1
"3858241"  -71.56462  42.39138  2  2
"3858242"  -83.94623  42.45551  1  1
"3858243"     4.9587  45.77077  3  1
"3858243"    6.44971  48.17573  3  2
"3858243"    6.44971  48.17573  3  3
"3858244"  -71.98106   41.9545  1  1
"3858245"  -74.00714  40.71455  2  1
"3858245"  -74.00714  40.71455  2  2
"3858246"  -97.74299  30.26759  1  1
"3858247" -118.24532  34.05349  1  1
"3858248"  -86.06411  39.48382  1  1
"3858249"  -98.49461  29.42458  1  1
"3858250"  -74.07528  40.64243  1  1
"3858251"  -87.71375  43.74978  6  1
"3858251"  -87.71375  43.74978  6  2
"3858251"  -87.71375  43.74978  6  3
"3858251"  -87.71375  43.74978  6  4
"3858251"  -87.71375  43.74978  6  5
"3858251"  -87.80372  43.99668  6  6
"3858252"  -118.1924  33.76673  1  1
"3858253"  -73.55469  45.51241  1  1
"3858254" -121.71051  39.41422  1  1
"3858255"  -86.14996  39.76691  1  1
"3858256"  -74.00714  40.71455  1  1
"3858257"  -73.45359  41.39268  1  1
"3858258" -118.41087   33.8872  1  1
"3858259"  -77.44223  41.13766  2  1
"3858259"  -78.77968  42.96212  2  2
"3858260"   -90.4899  41.49173  1  1
"3858261" 132.811447   33.7644  7  1
"3858261" 132.565781 34.256401  7  2
"3858261" 138.860886 35.119579  7  3
"3858261" 138.474854 35.600948  7  4
"3858261" 139.707489 35.615528  7  5
"3858261" 140.203751  35.73119  7  6
"3858261" 139.328506 36.230419  7  7
"3858262"  -70.89581   42.5224  1  1
"3858263"   20.92642 51.459499 19  1
"3858263"   -4.40377  55.63241 19  2
"3858263"   -4.40377  55.63241 19  3
"3858263"   -4.40377  55.63241 19  4
"3858263"   -4.40377  55.63241 19  5
"3858263"   -4.40377  55.63241 19  6
"3858263"   -4.40377  55.63241 19  7
"3858263"  26.132271 59.141079 19  8
"3858263"  26.132271 59.141079 19  9
"3858263"  26.132271 59.141079 19 10
"3858263"  26.132271 59.141079 19 11
"3858263"  26.132271 59.141079 19 12
"3858263"  26.132271 59.141079 19 13
"3858263"  26.132271 59.141079 19 14
"3858263"  26.132271 59.141079 19 15
"3858263"  26.132271 59.141079 19 16
"3858263"  26.132271 59.141079 19 17
"3858263"  26.132271 59.141079 19 18
"3858263"  26.132271 59.141079 19 19
"3858264"  -88.08226  43.04624  2  1
"3858264"  -88.12636    43.056  2  2
"3858265"  -71.28109  41.94509  1  1
"3858266"  -84.55103  41.47396  1  1
"3858267"  -78.87847  42.88544  1  1
"3858268"  -73.96402  42.82682  1  1
"3858269"    -83.048  42.33168  3  1
"3858269"  -82.93755  42.37577  3  2
"3858269"  -83.37833  42.46548  3  3
"3858270"   -85.7664  38.25486  1  1
"3858271"   -1.59103  52.28176  2  1
"3858271"   -1.59103  52.28176  2  2
"3858272"  -86.24593  43.23424  2  1
"3858272"  -86.24593  43.23424  2  2
"3858273"  -71.81567   41.3436  2  1
"3858273"  -71.43525  41.77884  2  2
"3858274"  -81.86294  41.01013  2  1
"3858274"  -81.86402  41.14032  2  2
"3858275"  -97.95754  34.50187  1  1
"3858276"  -82.40011  34.84827  1  1
"3858277"  -89.97346  29.94148  2  1
"3858277"  -90.07775   29.9537  2  2
"3858278"   11.49517  48.71687  2  1
"3858278"   11.42523  48.76237  2  2
"3858279"   13.18759  55.70623  5  1
"3858279"   13.20034 55.706841  5  2
"3858279"   17.55723  59.99289  5  3
"3858279"   16.30952  60.15544  5  4
"3858279"   20.89238  64.74951  5  5
"3858280"  -70.92851  42.52636  1  1
"3858281" -122.34271  37.93781  1  1
"3858282" -118.44434  34.15107  1  1
"3858283"  -80.15144  41.63649  1  1
"3858284"  -73.92937  41.70672  3  1
"3858284"   -74.1183  42.04194  3  2
"3858284"   -74.1183  42.04194  3  3
"3858285"    -.26171  52.08698  1  1
"3858286"    -.30695  51.39207  1  1
"3858287"   12.45425  55.68058  1  1
"3858288"  -85.77702   39.5212  2  1
"3858288"  -83.74847   42.2821  2  2
"3858289"  -83.14245  42.48816  3  1
"3858289"   -82.8774   42.5966  3  2
end
Thus, I have a variable containing the patent identification number (patent), the longitude and latitude of each inventor, and two counting variables that I created: N, the total number of inventors per patent, and N2, a running count of the inventors within each patent. My first aim is to generate the distances between all pairs of inventors within a patent. After that, I would like to take the average of these distances for each patent.

So far I have tried to adapt the code from the following old thread: https://www.stata.com/statalist/arch.../msg00657.html
However, I did not manage to get that code to work on my data set. I also tried to reshape my data so that I could use geodist to compute the distances, but my machine (about 8 GB RAM, Ryzen 5 processor) does not have enough memory to run that code.
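
For reference, one memory-light way to get the per-patent average without reshaping is to loop over within-patent lags, so that every pair of inventors is visited exactly once and only a few extra variables are created. The following is a minimal sketch, not a tested solution for the full data set. It assumes great-circle (haversine) distance in kilometres with an Earth radius of 6371 km, coordinates in decimal degrees with no missing values, and it uses a small subset of the example above. The names lat_r, lon_r, part_km, sum_km and avg_km are made up for illustration.

```stata
* Small self-contained subset of the example data (same variable names)
clear
input str12 patent double(longtitude latitude) float(N N2)
"3858241" -71.52723 42.15059 2 1
"3858241" -71.56462 42.39138 2 2
"3858242" -83.94623 42.45551 1 1
"3858245" -74.00714 40.71455 2 1
"3858245" -74.00714 40.71455 2 2
end

* Assumption: no missing coordinates (min() below would otherwise hide them)
assert !missing(latitude, longtitude)

* Coordinates in radians
gen double lat_r = latitude   * _pi / 180
gen double lon_r = longtitude * _pi / 180

sort patent N2
gen double part_km = 0

* The largest team size determines how many lags are needed
summarize N, meanonly
local maxlag = r(max) - 1

* For lag k, add the haversine distance between inventor _n and
* inventor _n + k of the same patent (assumed Earth radius: 6371 km)
forvalues k = 1/`maxlag' {
    quietly by patent: replace part_km = part_km +        ///
        2 * 6371 * asin(min(1, sqrt(                      ///
            sin((lat_r[_n + `k'] - lat_r) / 2)^2 +        ///
            cos(lat_r) * cos(lat_r[_n + `k']) *           ///
            sin((lon_r[_n + `k'] - lon_r) / 2)^2)))       ///
        if _n + `k' <= _N
}

* Sum over all pairs, then divide by the number of pairs N(N-1)/2
by patent: egen double sum_km = total(part_km)
gen double avg_km = sum_km / (N * (N - 1) / 2) if N > 1
drop lat_r lon_r part_km sum_km
```

In this sketch avg_km is left missing for single-inventor patents, since they have no pairs. The loop makes one pass over the data per lag, so the run time depends on the largest team size in the data rather than on the number of pairs.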

Could anyone help me understand how to get this working? Any help would be appreciated!