I have firm-level panel data (merged with patent-level data) with geographic information such as longitude, latitude, county, city, and state.
Using this, I would like to generate clusters or groups: for example, a group A made up of all firms within 1 mile (or 10 miles) of a reference location.
However, I am not clear on:
1) how to decide the reference location; and
2) how to generate this kind of cluster in Stata, or which command I should use (in connection with Ripley's spatial K-function or the space-time K-function).
For more information, the following is part of my dataset (since my data are confidential, this is a brief example):
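As a rough illustration of the grouping described above (a sketch, not a definitive solution), the code below flags observations within 1 and 10 miles of a single reference point using only built-in Stata functions. The reference coordinates (`ref_lat`, `ref_lon`), the new variable names (`dist_mi`, `within1`, `within10`), and the Earth-radius value are assumptions made for illustration; the few input rows reuse coordinates from the example data.

```stata
* Sketch only: flag observations within 1 or 10 miles of a reference point.
* ref_lat / ref_lon are hypothetical; here they are the "Chicago" coordinates.
clear
input int gvkey str20 city float(latitude longitude)
2  "Chicago"       41.65819 -87.67947
5  "Schaumburg"    41.65819 -87.67947
2  "Downers Grove" 41.81107 -88.02453
5  "New York"      40.74838 -73.996704
14 "Richmond"      .        .
end

local ref_lat = 41.65819
local ref_lon = -87.67947
local R   = 3958.8     // mean Earth radius in miles (approximation)
local d2r = _pi/180    // degrees to radians

* Haversine great-circle distance from each row to the reference point, in miles
gen double dist_mi = 2*`R'*asin(sqrt( ///
    sin((latitude - `ref_lat')*`d2r'/2)^2 + ///
    cos(`ref_lat'*`d2r')*cos(latitude*`d2r')* ///
    sin((longitude - `ref_lon')*`d2r'/2)^2))

* Group indicators; rows with missing coordinates stay missing
gen byte within1  = dist_mi <= 1  if !missing(dist_mi)
gen byte within10 = dist_mi <= 10 if !missing(dist_mi)

list gvkey city dist_mi within1 within10, noobs
```

Rows that share identical coordinates (as several cities in the same county do in the example) end up at distance zero from one another, so the geocoding precision limits how fine a radius is meaningful.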
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int gvkey double year long patent_num str100 city str8 state_code str33 county float(latitude longitude tag)
0 1950 123456 "Charlotte" "NC" "Mecklenburg" 35.410557 -80.84306 1
0 1950 123456 "Charlotte" "NC" "Mecklenburg" 35.410557 -80.84306 1
0 1950 123456 "Charlotte" "NC" "Mecklenburg" 35.410557 -80.84306 1
0 1950 123456 "Charlotte" "NC" "Mecklenburg" 35.410557 -80.84306 1
0 1950 123456 "Charlotte" "NC" "Mecklenburg" 35.410557 -80.84306 1
1 1950 123456 "Maumee" "OH" "Lucas" 41.70798 -83.70683 1
1 1950 123456 "Naperville" "IL" "DuPage" 41.79007 -88.20559 1
1 1950 123456 "Maumee" "OH" "Lucas" 41.70798 -83.70683 1
1 1950 123456 "Naperville" "IL" "DuPage" 41.79007 -88.20559 1
1 1950 123456 "Naperville" "IL" "DuPage" 41.79007 -88.20559 1
2 1950 123456 "Waukegan" "IL" "Lake" 42.16139 -88.13834 1
2 1950 123456 "Chicago" "IL" "Cook" 41.65819 -87.67947 1
2 1950 123456 "Naugatuck" "CT" "New Haven" 41.28052 -72.874146 1
2 1950 123456 "Waukegan" "IL" "Lake" 42.16139 -88.13834 1
2 1950 123456 "Downers Grove" "IL" "DuPage" 41.81107 -88.02453 1
3 1950 123456 "Irving" "TX" "Dallas" 32.9027 -96.5636 1
3 1950 123456 "Irving" "TX" "Dallas" 32.9027 -96.5636 1
3 1950 123456 "Irving" "TX" "Dallas" 32.9027 -96.5636 1
3 1950 123456 "Irving" "TX" "Dallas" 32.9027 -96.5636 1
3 1950 123456 "Irving" "TX" "Dallas" 32.9027 -96.5636 1
4 1950 123456 "Memphis" "TN" "Shelby" 35.0337 -89.9343 1
4 1950 123456 "Blue Bell" "PA" "Montgomery" 40.3128 -75.32134 1
4 1950 123456 "Blue Bell" "PA" "Montgomery" 40.3128 -75.32134 1
4 1950 123456 "Blue Bell" "PA" "Montgomery" 40.3128 -75.32134 1
4 1950 123456 "Memphis" "TN" "Shelby" 35.0337 -89.9343 1
5 1950 123456 "Dayton" "OH" "Montgomery" 39.84139 -84.41647 1
5 1950 123456 "New York" "NY" "New York" 40.74838 -73.996704 1
5 1950 123456 "Schaumburg" "IL" "Cook" 41.65819 -87.67947 1
5 1950 123456 "Chicago" "IL" "Cook" 41.65819 -87.67947 1
5 1950 123456 "La Jolla" "CA" "San Diego" 33.02935 -116.85355 1
6 1950 123456 "Stamford" "CT" "Fairfield" 41.25555 -73.43528 1
6 1950 123456 "Stamford" "CT" "Fairfield" 41.25555 -73.43528 1
6 1950 123456 "Stamford" "CT" "Fairfield" 41.25555 -73.43528 1
6 1950 123456 "Palo Alto" "CA" "Santa Clara" 37.444324 -122.1497 1
6 1950 123456 "Racine" "WI" "Racine" 42.69673 -87.90308 1
7 1950 123456 "Hoffman Estates" "IL" "Cook" 41.65819 -87.67947 1
7 1950 123456 "Cleveland" "OH" "Cuyahoga" 41.45346 -81.92177 1
7 1950 123456 "Pittsburgh" "PA" "Allegheny" 40.59417 -79.97028 1
7 1950 123456 "Denver" "CO" "Denver" 39.7507 -104.989 1
7 1951 123456 "Hoffman Estates" "IL" "Cook" 41.65819 -87.67947 1
8 1950 123456 "Culver City" "CA" "Los Angeles" 34.26187 -118.45866 1
8 1950 123456 "Chicago" "IL" "Cook" 41.65819 -87.67947 1
8 1950 123456 "Harrisburg" "PA" "Dauphin" 40.2782 -76.70937 1
8 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
8 1950 123456 "Chicago" "IL" "Cook" 41.65819 -87.67947 1
9 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
9 1950 123456 "Austin" "MN" "Freeborn" 43.75284 -93.16799 1
9 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
9 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
9 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
10 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
10 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
10 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
10 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
10 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
11 1950 123456 "" "TX" "" . . 1
11 1950 123456 "Downers Grove" "IL" "DuPage" 41.81107 -88.02453 1
11 1950 123456 "Buffalo" "NY" "Erie" 42.834 -78.63425 1
11 1950 123456 "Oak Brook" "IL" "DuPage" 41.8364 -87.95317 1
11 1950 123456 "White Pine" "MI" "Ontonagon" 46.73806 -89.17944 1
12 1950 123456 "Brentwood" "TN" "Davidson" 36.260387 -86.70456 1
12 1950 123456 "Pittsburgh" "PA" "Allegheny" 40.59417 -79.97028 1
12 1951 123456 "Brentwood" "TN" "Davidson" 36.260387 -86.70456 1
12 1951 123456 "Pittsburgh" "PA" "Allegheny" 40.59417 -79.97028 1
12 1952 123456 "Pittsburgh" "PA" "Allegheny" 40.59417 -79.97028 1
13 1950 123456 "Parsippany" "NJ" "Morris" 40.8819 -74.62099 1
13 1950 123456 "Saint Paul" "MN" "Ramsey" 44.96996 -93.08317 1
13 1950 123456 "Clarks Summit" "PA" "Lackawanna" 41.34319 -75.530136 1
13 1950 123456 "" "" "" . . 1
13 1951 123456 "Clarks Summit" "PA" "Lackawanna" 41.3731 -75.6841 1
14 1950 123456 "Richmond" "VA" "Richmond city" . . 1
14 1950 123456 "Atlanta" "GA" "DeKalb" 33.888504 -84.28954 1
14 1950 123456 "Atlanta" "GA" "Fulton" 34.040833 -84.3859 1
14 1950 123456 "Carteret" "NJ" "Middlesex" 40.33243 -74.56883 1
14 1950 123456 "Richmond" "VA" "Richmond city" . . 1
15 1950 123456 "" "" "" . . 1
15 1950 123456 "" "" "" . . 1
15 1950 123456 "" "" "" . . 1
15 1950 123456 "Dayton" "OH" "Warren" 39.37145 -84.21078 1
15 1950 123456 "" "" "" . . 1
16 1950 123456 "New Britain" "CT" "Hartford" 41.92608 -72.64576 1
16 1950 123456 "New York" "NY" "New York" 40.74838 -73.996704 1
16 1950 123456 "Columbus" "OH" "Franklin" 40.03219 -83.13834 1
16 1950 123456 "New York" "NY" "New York" 40.74838 -73.996704 1
16 1950 123456 "New York" "NY" "New York" 40.74838 -73.996704 1
17 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
17 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
17 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
17 1950 123456 "Houston" "TX" "Harris" 30.00409 -95.28248 1
17 1950 123456 "Philadelphia" "PA" "Philadelphia" 40.1162 -75.0141 1
18 1950 123456 "Upland" "IN" "Grant" 40.5112 -85.82655 1
18 1950 123456 "El Segundo" "CA" "Los Angeles" 34.26187 -118.45866 1
18 1950 123456 "El Segundo" "CA" "Los Angeles" 34.26187 -118.45866 1
18 1950 123456 "Mckeesport" "PA" "Allegheny" 40.59417 -79.97028 1
18 1950 123456 "Oakland" "CA" "" . . 1
19 1950 123456 "Yardley" "PA" "Bucks" 40.57824 -75.21906 1
19 1950 123456 "Yardley" "PA" "Bucks" 40.57824 -75.21906 1
19 1950 123456 "Yardley" "PA" "Bucks" 40.57824 -75.21906 1
19 1950 123456 "Yardley" "PA" "Bucks" 40.57824 -75.21906 1
19 1950 123456 "Cupertino" "CA" "Santa Clara" 37.444324 -122.1497 1
end
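If no single reference location is obvious, one possible alternative is to treat every location as its own reference and count the neighbors within a given radius. Counts of pairs within distance r are also the raw ingredient of Ripley's K-function, although the sketch below does not compute K itself (no area scaling or edge correction). The identifier `id`, the `_i`/`_j` suffixes, and the variables `dist_mi`, `near10`, and `n_near10` are hypothetical names; the rows reuse coordinates from the example above.

```stata
* Sketch only: for each location, count the other locations within 10 miles.
clear
input byte id str20 city float(latitude longitude)
1 "Chicago"       41.65819 -87.67947
2 "Downers Grove" 41.81107 -88.02453
3 "Oak Brook"     41.8364  -87.95317
4 "Naperville"    41.79007 -88.20559
5 "New York"      40.74838 -73.996704
end

* Save a copy with _j suffixes, then form all ordered pairs (i, j)
rename (id city latitude longitude) (id_j city_j lat_j lon_j)
tempfile pts
save `pts'
rename (id_j city_j lat_j lon_j) (id_i city_i lat_i lon_i)
cross using `pts'
drop if id_i == id_j          // a point is not its own neighbor

local R   = 3958.8            // mean Earth radius in miles (approximation)
local d2r = _pi/180           // degrees to radians

* Haversine great-circle distance between the two points of each pair
gen double dist_mi = 2*`R'*asin(sqrt( ///
    sin((lat_j - lat_i)*`d2r'/2)^2 + ///
    cos(lat_i*`d2r')*cos(lat_j*`d2r')* ///
    sin((lon_j - lon_i)*`d2r'/2)^2))

* Number of neighbors within 10 miles of each location
gen byte near10 = dist_mi <= 10
collapse (sum) n_near10 = near10, by(id_i city_i)
list, noobs
```

Forming all pairs with `cross` grows with the square of the number of locations, so for a full panel it would probably be sensible to reduce the data to unique coordinates first.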
I have searched the related literature on these methods; however, I have not yet found a clear solution or code that I could use.
Sorry if my questions are unclear.
Could someone please help me with these issues?
Thank you very much in advance,
A.-C