My goal is to match firms from two datasets so that the difference in their MarketCap is as small as possible, using the rangejoin command, without matching the same firm more than once. Here low = MarketCap*0.7 and high = MarketCap*1.3, and MarketCap_U is the market cap of the matching firm.
CODE:
rangejoin MarketCap low high using "control80.dta"
gen delta = abs( MarketCap - MarketCap_U )
set seed 1234
gen double shuffle = runiform()
by perm (delta shuffle), sort: keep if _n == 1
DATA
* Example generated by -dataex-. To install: ssc install dataex
clear
input long perm str22 name long cusip float MarketCap str28 companyname float(MarketCapM delta)
43597 "IMEX MEDICAL SYSTEMS" 45247510 2.00e+11 "ACME PRECISION PRODUCTS INC" 1.46125e+12 1.26125e+12
53963 "MIDWEST EXPLORATION" 59832310 203022745600 "ACME PRECISION PRODUCTS INC" 1.46125e+12 1.2582272e+12
10066 "ABM COMPUTER SYSTEMS" 77510 2.04e+11 "ACME PRECISION PRODUCTS INC" 1.46125e+12 1.25725e+12
44266 "INSITUFORM EAST" 45766210 2.0625e+11 "ACME PRECISION PRODUCTS INC" 1.46125e+12 1.255e+12
33567 "ESSEX" 29674410 2.00e+11 "ACME PRECISION PRODUCTS INC" 1.46125e+12 1.26125e+12
65330 "LEGG MASON" 52490110 3.031325e+12 "ADAMS DRUG INC" 2.133075e+13 1.8299425e+13
48267 "LSI LOGIC CORP." 50216110 1.848e+13 "ADAMS RESOURCES & ENERGY INC" 1.2945075e+14 1.1097075e+14
47731 "KINCAID FURNITURE" 49449010 1.84375e+12 "ADAMS-MILLIS CORP" 1.294875e+13 1.1105e+13
12424 "AMERICA WEST AIR" 2365010 1.84375e+12 "ADAMS-MILLIS CORP" 1.294875e+13 1.1105e+13
57323 "NEW HAMPSHIRE SAVINGS" 64467010 1.8493844e+12 "ADAMS-MILLIS CORP" 1.294875e+13 1.1099366e+13
49753 "LINEAR CORPORATION" 53566710 1.84375e+12 "ADAMS-MILLIS CORP" 1.294875e+13 1.1105e+13
32054 "ELECTRO SCIENTIFIC" 28522910 1.84875e+12 "ADAMS-MILLIS CORP" 1.294875e+13 1.11e+13
63378 "TECHAMERICA GROUP" 87831510 1.84375e+12 "ADAMS-MILLIS CORP" 1.294875e+13 1.1105e+13
29875 "DICEON ELECTRONICS" 25302610 3.312e+12 "AERO-FLOW DYNAMICS INC" 2.33185e+13 2.0006499e+13
68582 "MORGAN KEEGAN & CO." 61741010 1.98e+12 "AEROFLEX INC" 1.39995e+13 1.20195e+13
33102 "ENTRE COMPUTER" 29382510 1.992375e+12 "AEROFLEX INC" 1.39995e+13 1.2007125e+13
60135 "OVERLAND EXPRESS" 69022210 1.98e+12 "AEROFLEX INC" 1.39995e+13 1.20195e+13
14016 "AMHERST ASSOCIATES" 3118010 1.99045e+12 "AEROFLEX INC" 1.39995e+13 1.200905e+13
50359 "LYPHOMED, INC." 55233310 3.083025e+12 "AERONCA INC" 2.17155e+13 1.8632474e+13
11360 "AIR ONE, INC." 913810 3.075e+12 "AERONCA INC" 2.17155e+13 1.86405e+13
51450 "MARGAUX CONTROLS" 56660010 3.09375e+12 "AERONCA INC" 2.17155e+13 1.862175e+13
62026 "PENTA SYSTEMS" 70961810 1.4853875e+12 "AFFILIATED CAPITAL CORP" 1.0415625e+13 8.930237e+12
27706 "CRIME CONTROL" 22660810 7.43125e+11 "ALBA-WALDENSIAN INC" 5.204125e+12 4.461e+12
49905 "LIZ CLAIBORNE" 53932010 2.415e+12 "ALCOLAC INC" 1.69265e+13 1.45115e+13
79119 "U.S. HEALTH CARE" 91191010 3.1625e+12 "ALLIED PRODUCTS" 2.224625e+13 1.9083752e+13
27044 "CONVERSE, INC." 21253910 7.025e+12 "ALLIED TELEPHONE" 4.9203e+13 4.2178e+13
55651 "MUSTANG DRILLING" 62819010 6.1875e+11 "ALTEC CORP" 4.402664e+12 3.783914e+12
28362 "DAKOTA RESOURCES" 23426010 6.125e+11 "ALTEC CORP" 4.402664e+12 3.790164e+12
80929 "VIRATEK" 92764810 6.27e+11 "ALTEC CORP" 4.402664e+12 3.775664e+12
63124 "CRAWFORD ENERGY" 22501510 6.28125e+11 "ALTEC CORP" 4.402664e+12 3.774539e+12
64661 "TECHNICOM INT'L" 87854010 6.1275e+11 "ALTEC CORP" 4.402664e+12 3.789914e+12
18279 "BIOSTIM" 9091310 6.24375e+11 "ALTEC CORP" 4.402664e+12 3.778289e+12
33911 "FAFCO" 30239010 6.069e+11 "ALTEC CORP" 4.402664e+12 3.795764e+12
40898 "HAMMOND CO." 40835910 6.25e+11 "ALTEC CORP" 4.402664e+12 3.777664e+12
63386 "ULTIMATE CORPORATION" 90384810 6.249997e+11 "ALTEC CORP" 4.402664e+12 3.777664e+12
74230 "TCA CABLE TV" 87224110 6.15e+11 "ALTEC CORP" 4.402664e+12 3.787664e+12
66393 "RENAL SYSTEMS" 75991210 6.11875e+11 "ALTEC CORP" 4.402664e+12 3.790789e+12
24097 "CLINICAL DATA" 18725910 6.125e+11 "ALTEC CORP" 4.402664e+12 3.790164e+12
75767 "TEXAS VANGUARD" 88285310 6.1875e+11 "ALTEC CORP" 4.402664e+12 3.783914e+12
37429 "FOSTER MEDICAL" 35012410 3.01875e+12 "AMERICAN CENTURY CORP" 2.118188e+13 1.8163127e+13
72996 "STIFEL FINANCIAL" 86063010 1.9478744e+12 "AMERICAN CONTROLLED INDS" 1.36445e+13 1.1696625e+13
32620 "ENDOTRONICS, INC." 29264410 7.392e+11 "AMERICAN INDEPENDENCE CORP" 5.198375e+12 4.459175e+12
11835 "ALLEGHENY & WESTERN" 1722710 7.375e+11 "AMERICAN INDEPENDENCE CORP" 5.198375e+12 4.4608746e+12
27167 "COPYTELE, INC." 21772110 7.4175e+11 "AMERICAN INDEPENDENCE CORP" 5.198375e+12 4.4566247e+12
20918 "CAMBRIDGE BIOSCIENCE" 13215710 7.425e+11 "AMERICAN INDEPENDENCE CORP" 5.198375e+12 4.455875e+12
78626 "UNITED ED AND SFTWRE" 91020410 7.40e+11 "AMERICAN INDEPENDENCE CORP" 5.198375e+12 4.458375e+12
28418 "DALLAS FEDERAL S&L" 23503010 3.226956e+12 "AMERICAN MEDICAL BLDGS INC" 2.2607e+13 1.9380044e+13
40476 "H&H OIL TOOL" 40404010 1.817625e+12 "AMERICAN PRECISION INDS" 1.2864875e+13 1.104725e+13
43829 "INDEPENDENCE HEALTH" 45343610 1.41625e+12 "AMERICAN SEATING CO" 9.99025e+12 8.574e+12
62595 "PHIL SAVINGS FUND SOC" 71800210 3.441183e+13 "AMERIFIN CORP" 2.42228e+14 2.0781617e+14
65032 "QUALITY MICRO SYSTEMS" 74790710 2.28965e+12 "ANDAL CORP" 1.61225e+13 1.383285e+13
74203 "TBC CORPORATION" 87218010 2.3003063e+12 "ANDAL CORP" 1.61225e+13 1.3822194e+13
67889 "ENERGY OIL" 29291810 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
80582 "VERTX CORPORATION" 92533710 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
63888 "ASTRO DRILLING" 4590010 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
32804 "ENEX RESOURCES" 29274410 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
78554 "UNITED ENERGY TECH" 90990510 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
67396 "ROCKWELL DRILLING" 77434210 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
20862 "CALVIN EXPLORATION" 13165810 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
19677 "BUFFTON OIL & GAS" 11988510 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
31721 "EIKONIX" 28256010 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
72371 "STAN WEST MINING" 85285810 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
63159 "EXPLORATION SURVEYS" 30213510 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
57112 "NETWORD, INC." 64120510 5.25e+11 "ANDREA ELECTRONICS CORP" 3.683e+12 3.158e+12
83863 "XICOR" 98490310 1.71e+12 "ANIXTER INTL INC" 1.20e+13 1.029e+13
35482 "FIRST FED. FT. MYERS" 31991810 1.712525e+12 "ANIXTER INTL INC" 1.20e+13 1.0287475e+13
17065 "BASIC AMERICAN MED." 6983610 3.946635e+12 "APPLIED DATA RESEARCH INC" 2.764e+13 2.3693365e+13
44792 "INTERGRAPH" 45868310 3.9375e+12 "APPLIED DATA RESEARCH INC" 2.764e+13 2.37025e+13
16564 "BANCTEC" 5978410 1.64e+12 "ARMADA CORP" 1.152825e+13 9.88825e+12
35466 "FIRST DATA MANAGEMENT" 31991510 1.6379875e+12 "ARMADA CORP" 1.152825e+13 9.890262e+12
66667 "ODETICS" 67606510 1.09375e+12 "ARMATRON INTERNATIONAL INC" 7.659375e+12 6.565625e+12
40222 "GTECH CORPORATION" 40051610 2.46e+12 "ARROW AUTOMOTIVE INDUSTRIES" 1.72375e+13 1.47775e+13
64135 "WILLIAMS ELECTRONICS" 96990110 1.225e+12 "AUDIOTRONICS CORP" 8.586e+12 7.361e+12
55192 "MORRIS COUNTY S&L" 61803010 1.331409e+12 "AUDITS & SURVEYS WORLDWIDE" 9.394e+12 8.062592e+12
43263 "ILC TECHNOLOGY, INC." 44965410 1.325e+12 "AUDITS & SURVEYS WORLDWIDE" 9.394e+12 8.069e+12
64654 "PUBLISHERS EQUIPMENT" 74465010 1.332375e+12 "AUDITS & SURVEYS WORLDWIDE" 9.394e+12 8.061625e+12
18738 "BONRAY DRILLING" 9852310 1.325e+12 "AUDITS & SURVEYS WORLDWIDE" 9.394e+12 8.069e+12
81219 "VODAVI TECHNOLOGY" 92890310 1.32825e+12 "AUDITS & SURVEYS WORLDWIDE" 9.394e+12 8.06575e+12
69957 "SELECTERM, INC." 81628510 1.333125e+12 "AUDITS & SURVEYS WORLDWIDE" 9.394e+12 8.060876e+12
47352 "KELLY-JOHNSTON ENTERPR" 48811910 8.10e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.915e+12
55750 "NCA" 62878710 8.1225e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.91275e+12
71942 "SPAN-AMERICAN MEDICAL" 84639610 8.096e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.9154003e+12
30227 "DISTRIBUTED LOGIC" 25490610 8.14e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.911e+12
16206 "BIW CABLE SYSTEMS" 5547510 8.1614e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.9088603e+12
59636 "1 POTATO 2, INC." 68241010 8.1125e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.91375e+12
21937 "CELLULAR TECHNOLOGY" 15116810 8.125e+11 "BAKER (MICHAEL) CORP" 5.725e+12 4.9125e+12
65699 "NI INDUSTRIES" 62913510 1.185e+13 "BANCWEST CORP" 8.34075e+13 7.15575e+13
37402 "L.B. FOSTER" 35006010 2.89e+12 "BANGOR HYDRO-ELECTRIC CO" 2.0319e+13 1.7429e+13
49411 "LIEBERT" 53173510 2.8875e+12 "BANGOR HYDRO-ELECTRIC CO" 2.0319e+13 1.74315e+13
79258 "U.S. TELEPHONE, INC." 91272010 2.8975995e+12 "BANGOR HYDRO-ELECTRIC CO" 2.0319e+13 1.74214e+13
30091 "DIMIS" 25434910 3.1875e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.92125e+12
65103 "UNITEL VIDEO" 91325310 3.20e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.92e+12
61437 "PATHFINDER PETROLEUM" 70290010 318909382656 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.9210906e+12
65155 "COMPUTRAC INSTRUM." 20591710 3.19125e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.920875e+12
42578 "HORNBECK OFFSHORE SVS" 44054210 3.16875e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.923125e+12
25639 "COMPU-PLAN, INC." 20476120 3.162096e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.9237904e+12
48857 "LASERMETRICS" 51807910 3.20e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.92e+12
36760 "FLEXIBLE COMPUTER" 33938010 3.1875e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.92125e+12
28338 "DAIG" 23390210 3.20e+11 "BANK OF COMMONWEALTH-DETROIT" 2.24e+12 1.92e+12
64072 "PRIMAGES, INC." 74154910 1.0395e+12 "BARCO OF CALIFORNIA" 7.303125e+12 6.263625e+12
end
However, this code only keeps the closest match for each firm, so the same matching firm can still end up paired with many firms. Is there a way to avoid repetitions among the matching firms?
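One possible approach is a greedy match without replacement: take the pair with the smallest delta, discard every other pair that involves either of its two firms, and repeat until no pairs are left. The block below is only a minimal, untested sketch of that idea. It would run in place of the last three lines of the code above (after delta is generated), and it assumes that perm identifies the firm on one side of each pair and companyname the firm on the other side, which is a guess based on the data excerpt. The names control_id, pairno, avail, and picked are hypothetical helpers.

```stata
* Greedy one-to-one matching: smallest delta first, each firm used once.
* Assumes -perm- and -companyname- identify the two sides of each pair.
drop if missing(delta)                     // firms with no candidate in range
set seed 1234
gen double shuffle = runiform()            // random tie-breaker
egen long control_id = group(companyname)  // hypothetical numeric id
sort delta shuffle
gen long pairno = _n                       // rank of each candidate pair
gen byte avail  = 1                        // pair can still be chosen
gen byte picked = 0                        // pair is part of the final match

quietly count if avail
local left = r(N)
while `left' > 0 {
    * best remaining pair = lowest rank among the available pairs
    quietly summarize pairno if avail, meanonly
    local i = r(min)
    local p = perm[`i']
    local c = control_id[`i']
    quietly replace picked = 1 in `i'
    * neither firm of the chosen pair may be used again
    quietly replace avail = 0 if perm == `p' | control_id == `c'
    quietly count if avail
    local left = r(N)
}
keep if picked
isid perm          // each firm on this side appears once
isid control_id    // each matching firm appears once
```

A greedy pass like this does not necessarily minimize the total difference across all pairs, and some firms may be left without a match once their candidates have been used by others.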