I am writing code to capture differences in earnings management between US firms and cross-listed firms (foreign firms listed on an American stock exchange). Because cross-listed firms are self-selected, the sample may suffer from selection bias. I therefore need to match the cross-listed firms with US firms on:
- mtb (market-to-book ratio)
- roa (return on assets)
- at (total assets)
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long gvkey double fyear float(dummy_usa dummy_foreign) double at float(roa mtb)
1004 1996 1 0 529.584 .04347752 .
1004 1997 1 0 670.559 .05317504 .
1004 1998 1 0 726.63 .05734831 1.6586404
1004 1999 1 0 740.998 .04745357 1.0978953
1004 2000 1 0 701.854 .02640293 1.1084794
1004 2001 1 0 710.199 -.08298942 1.1752149
1004 2002 1 0 686.621 -.018074017 .4858825
1004 2003 1 0 709.292 .004940137 1.0239426
1004 2004 1 0 732.23 .021104025 1.6606493
1004 2005 1 0 978.819 .035923906 2.0879886
1004 2006 1 0 1067.633 .05494398 2.4809506
1004 2007 1 0 1362.01 .0551714 1.2772952
1004 2008 1 0 1377.511 .05709646 .8701464
1004 2009 1 0 1501.042 .029731346 1.0421851
1004 2010 1 0 1703.727 .04098427 1.2568352
1004 2011 1 0 2195.653 .03084413 .56036645
1004 2012 1 0 2136.9 .02573822 .8591657
1004 2013 1 0 2199.5 .033143897 .9606355
1004 2014 1 0 1515 .006732673 1.2381912
1004 2015 1 0 1442.1 .033076763 .9731014
1010 1997 1 0 3181.3 .068022504 .
1010 1998 1 0 3257.3 .020630583 .
1010 1999 1 0 3563.4 .021692766 .
1010 2000 1 0 3794.5 .021056794 .
1010 2001 1 0 3723.1 .03913406 .
1010 2002 1 0 3702.5 .021525996 .
1010 2003 1 0 4832.1 .07294965 .
1013 1997 1 0 936.303 .11624122 .
1013 1998 1 0 1300.587 .11281598 3.393175
1013 1999 1 0 1672.529 .0523967 5.735749
1013 2000 1 0 3970.5 .21863745 5.652886
1013 2001 1 0 2499.7 -.51514184 1.903243
1013 2002 1 0 1144.2 -1.0006992 1.725441
1013 2003 1 0 1296.9 -.05914103 3.3024726
1013 2004 1 0 1428.1 .01148379 2.715488
1013 2005 1 0 1535 .07211726 2.6268575
1013 2006 1 0 1611.4 .040772 1.9200138
1013 2007 1 0 1764.8 .06023346 2.1825328
1013 2008 1 0 1921 -.021811556 .7718683
1013 2009 1 0 1343.6 -.3530068 2.2617743
1013 2010 1 0 1474.5 .04204815 2.835
1019 1997 1 0 26.71 .0426432 .
1019 1998 1 0 29.283 .05624424 2.919797
1019 1999 1 0 29.341 .03489997 2.787749
1019 2000 1 0 28.638 .06152664 2.3041565
1019 2001 1 0 30.836 .04183422 3.302927
1021 1997 1 0 20.516 .07550205 .
1021 1998 1 0 18.661 -.17833985 1.0966128
1021 1999 1 0 13.986 -.15780066 .6230607
1021 2000 1 0 11.608 -.06960717 1.0197082
1021 2001 1 0 8.635 -.2012739 1.0919029
1021 2002 1 0 7.85 .010700637 .55830675
1021 2003 1 0 6.044 -.25066182 1.194015
1021 2004 1 0 6.245 .2153723 5.218199
1021 2005 1 0 8.153 .23304304 4.236929
1021 2006 1 0 14.341 .0700788 2.719383
1021 2007 1 0 27.171 -.17198484 2.0490286
1021 2008 1 0 21.401 -.5162843 1.2018434
1034 1997 1 0 631.866 .027550146 .
1034 1998 1 0 908.936 .02663664 3.564293
1034 1999 1 0 1160.266 .031865105 2.587424
1034 2000 1 0 1610.435 .034467705 2.080925
1034 2001 1 0 2390.008 -.015863545 1.314704
1034 2002 1 0 2296.924 -.0433889 .6115829
1034 2003 1 0 2329.268 .005938776 .9239342
1034 2004 1 0 2003.842 -.15706678 1.0131919
1034 2005 1 0 1623.383 .08240138 1.6793386
1034 2006 1 0 927.239 .08902128 1.434651
1034 2007 1 0 1288.165 -.010542904 1.206971
1036 1997 0 1 1778.547 .07757906 .
1036 1998 0 1 2113.32 .04717128 .9201303
1036 1999 0 1 2241.575 .03966407 .8469118
1036 2000 0 1 2325.377 .02431864 .51730186
1037 1996 1 0 4.969 -.555645 .
1037 1997 1 0 5.45 .1719266 .
1037 1998 1 0 3.228 -1.078067 15.407714
1037 1999 1 0 4.575 -.2450273 72.605804
1037 2000 1 0 6.373 .18264553 6.106415
1037 2001 1 0 17.867 .0374993 3.4009595
1038 1996 1 0 718.213 .026447587 .
1038 1997 1 0 795.78 -.03078615 .
1038 1998 1 0 975.73 -.016414378 3.127864
1038 1999 1 0 1188.805 -.04642225 2.0251205
1038 2000 1 0 1047.264 -.10110727 -2.8141334
1038 2001 1 0 1279.17 -.008965189 1.7854867
1038 2002 1 0 1491.698 -.013609993 1.0782552
1038 2003 1 0 1506.534 -.007111688 2.0165327
1043 1997 1 0 44.9 .0022939867 .
1043 1998 1 0 45.639 .02346677 -1.848265
1043 1999 1 0 42.21 -.0032693674 -.6050181
1045 1997 1 0 20915 .04709538 .
1045 1998 1 0 22303 .05891584 1.43031
1045 1999 1 0 24374 .04041192 1.4482962
1045 2000 1 0 26213 .031015145 .8304026
1045 2001 1 0 32841 -.05365245 .6411717
1045 2002 1 0 30267 -.11600093 1.0764759
1045 2003 1 0 29330 -.04186839 44.9258
1045 2004 1 0 28773 -.026448406 -3.0372775
1045 2005 1 0 29495 -.02919139 -2.748398
1045 2006 1 0 29145 .007925888 -11.08553
end
I currently have 5,208 cross-listed firms, and I want to reduce the number of American firms (16,663) to the same number. The full sample contains 186,587 observations.
My question looks similar to the problem discussed here: https://www.statalist.org/forums/for...with-firm-size
However, when I follow that approach I end up with only 326 observations. It also places each American firm in the same row as its matched cross-listed firm. What I want instead is to drop all unmatched American observations and keep a dataset in which the matched US and cross-listed firms remain separate observations, so that I can still run regressions on them.
Perhaps what I mean is not exactly called 'matching'. In any case, for each cross-listed firm I want to keep the one American firm that is most similar in terms of mtb, roa and at.
I may also have to match within year (that is, the main criterion for two observations to be matched is that they fall in the same year). If that is necessary, how can I adapt the code?
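The goal described above (one most-similar US observation per cross-listed observation, within the same year, with the result kept in long format) could be sketched roughly as follows. This is only an illustration under several assumptions: gvkey and fyear uniquely identify observations; similarity is measured as Euclidean distance on standardized mtb, roa and log total assets, which is a choice made here and not something taken from the linked thread; matching is done with replacement, so one US firm-year can serve several cross-listed firm-years; and the names ln_at, z_*, zus_*, gvkey_us, dist, controls and matched are made up for the example.

```stata
* Sketch only: nearest US firm-year for each cross-listed firm-year, same fyear.
* Assumes the data are in memory and gvkey-fyear is unique.

* Observations without usable matching variables cannot be matched
drop if missing(mtb, roa, at) | at <= 0

* Put the three matching variables on a comparable scale (assumed choice)
gen double ln_at = ln(at)
foreach v in mtb roa ln_at {
    egen double z_`v' = std(`v')
}

* Pool of potential controls: US firm-years
preserve
keep if dummy_usa == 1
keep gvkey fyear z_mtb z_roa z_ln_at
rename (gvkey z_mtb z_roa z_ln_at) (gvkey_us zus_mtb zus_roa zus_ln_at)
tempfile controls
save `controls'
restore

* Pair each cross-listed firm-year with every US firm-year of the same year
preserve
keep if dummy_foreign == 1
keep gvkey fyear z_mtb z_roa z_ln_at
joinby fyear using `controls'

* Distance on the standardized variables; keep the closest control per firm-year
gen double dist = sqrt((z_mtb - zus_mtb)^2 + (z_roa - zus_roa)^2 + (z_ln_at - zus_ln_at)^2)
bysort gvkey fyear (dist gvkey_us): keep if _n == 1

* List of matched US firm-years (a control may have been picked more than once)
keep gvkey_us fyear
rename gvkey_us gvkey
duplicates drop
tempfile matched
save `matched'
restore

* Long format: cross-listed firm-years plus the matched US firm-years only
merge 1:1 gvkey fyear using `matched', keep(master match)
keep if dummy_foreign == 1 | _merge == 3
drop _merge
```

Because joinby forms every cross-listed/US pair within a year, the intermediate dataset can become very large with a sample of this size; running the pairing step one year at a time and appending the results would be one way around that. Matching at the firm level rather than the firm-year level (for example, on values in the first year a firm is cross-listed) would need a different pairing key.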
Thank you in advance.