I have panel data in which my independent variables (Index1 to Index4) are highly collinear. In that case, rather than dropping one or more of the collinear variables, is it legitimate to transform the variables so that we can retain them? I will demonstrate my data and results with an example.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(Index1 Index2 Index3 Index4) long id int year float dep_var
5.46687 . . . 1 1999 0
3.5714 53.3333 49.2386 37.4359 1 2000 .0469986
3.77717 . . . 1 2001 0
3.97991 55.102 35.3535 34.1837 1 2002 0
4.09675 56.6326 44.9495 43.3674 1 2003 0
3.94243 55.665 34.6342 44.335 1 2004 0
3.94921 51.4706 33.1707 50 1 2005 0
4.05847 57.0732 37.0732 48.0392 1 2006 0
3.92085 59.2233 33.4951 50.9709 1 2007 0
4.64972 58.7379 36.4078 50.9709 1 2008 0
4.8054 57.8947 36.8421 45.933 1 2009 0
4.70902 58.3732 33.3333 44.4976 1 2010 0
4.83402 58.7678 37.9147 44.5498 1 2011 0
4.82298 57.8199 40.2844 44.0758 1 2012 0
4.66564 54.9763 44.5498 44.0758 1 2013 0
4.55899 65.3846 45.6731 43.75 1 2014 0
4.52303 68.2692 48.5577 44.2308 1 2015 0
4.86224 66.8269 49.0385 44.2308 1 2016 0
5.33097 68.2692 46.1539 48.5577 1 2017 0
5.62695 69.2308 46.1539 47.1154 1 2018 0
5.89539 71.6346 45.1923 42.7885 1 2019 0
3.5714 53.3333 49.2386 37.4359 2 2000 .
3.77717 . . . 2 2001 0
3.97991 55.102 35.3535 34.1837 2 2002 0
4.09675 56.6326 44.9495 43.3674 2 2003 0
3.94243 55.665 34.6342 44.335 2 2004 .
3.94921 51.4706 33.1707 50 2 2005 .
4.05847 57.0732 37.0732 48.0392 2 2006 .5771455
3.92085 59.2233 33.4951 50.9709 2 2007 .
4.64972 58.7379 36.4078 50.9709 2 2008 .
4.8054 57.8947 36.8421 45.933 2 2009 0
4.70902 58.3732 33.3333 44.4976 2 2010 0
4.83402 58.7678 37.9147 44.5498 2 2011 0
4.82298 57.8199 40.2844 44.0758 2 2012 0
4.66564 54.9763 44.5498 44.0758 2 2013 0
4.55899 65.3846 45.6731 43.75 2 2014 0
4.52303 68.2692 48.5577 44.2308 2 2015 0
4.86224 66.8269 49.0385 44.2308 2 2016 0
5.33097 68.2692 46.1539 48.5577 2 2017 0
5.62695 69.2308 46.1539 47.1154 2 2018 .
5.89539 71.6346 45.1923 42.7885 2 2019 0
5.46687 . . . 3 1999 0
3.5714 53.3333 49.2386 37.4359 3 2000 .
3.77717 . . . 3 2001 .
3.97991 55.102 35.3535 34.1837 3 2002 .
4.09675 56.6326 44.9495 43.3674 3 2003 .
3.94243 55.665 34.6342 44.335 3 2004 .
3.94921 51.4706 33.1707 50 3 2005 .
4.05847 57.0732 37.0732 48.0392 3 2006 .
3.92085 59.2233 33.4951 50.9709 3 2007 0
4.64972 58.7379 36.4078 50.9709 3 2008 0
4.8054 57.8947 36.8421 45.933 3 2009 .
4.70902 58.3732 33.3333 44.4976 3 2010 0
4.83402 58.7678 37.9147 44.5498 3 2011 0
4.82298 57.8199 40.2844 44.0758 3 2012 0
4.66564 54.9763 44.5498 44.0758 3 2013 .
4.55899 65.3846 45.6731 43.75 3 2014 0
4.52303 68.2692 48.5577 44.2308 3 2015 .
4.86224 66.8269 49.0385 44.2308 3 2016 0
5.33097 68.2692 46.1539 48.5577 3 2017 0
5.62695 69.2308 46.1539 47.1154 3 2018 0
5.89539 71.6346 45.1923 42.7885 3 2019 0
5.46687 . . . 4 1999 0
3.5714 53.3333 49.2386 37.4359 4 2000 0
3.77717 . . . 4 2001 0
3.97991 55.102 35.3535 34.1837 4 2002 0
4.09675 56.6326 44.9495 43.3674 4 2003 .
3.94243 55.665 34.6342 44.335 4 2004 0
3.94921 51.4706 33.1707 50 4 2005 0
4.05847 57.0732 37.0732 48.0392 4 2006 0
3.92085 59.2233 33.4951 50.9709 4 2007 0
4.64972 58.7379 36.4078 50.9709 4 2008 0
4.8054 57.8947 36.8421 45.933 4 2009 0
4.70902 58.3732 33.3333 44.4976 4 2010 0
4.83402 58.7678 37.9147 44.5498 4 2011 0
4.82298 57.8199 40.2844 44.0758 4 2012 0
4.66564 54.9763 44.5498 44.0758 4 2013 0
4.55899 65.3846 45.6731 43.75 4 2014 0
4.52303 68.2692 48.5577 44.2308 4 2015 0
4.86224 66.8269 49.0385 44.2308 4 2016 0
5.33097 68.2692 46.1539 48.5577 4 2017 0
5.62695 69.2308 46.1539 47.1154 4 2018 0
5.89539 71.6346 45.1923 42.7885 4 2019 0
5.46687 . . . 5 1999 .
3.5714 53.3333 49.2386 37.4359 5 2000 .
3.77717 . . . 5 2001 0
3.97991 55.102 35.3535 34.1837 5 2002 0
4.09675 56.6326 44.9495 43.3674 5 2003 0
3.94243 55.665 34.6342 44.335 5 2004 .
3.94921 51.4706 33.1707 50 5 2005 0
4.05847 57.0732 37.0732 48.0392 5 2006 .
3.92085 59.2233 33.4951 50.9709 5 2007 0
4.64972 58.7379 36.4078 50.9709 5 2008 .
4.8054 57.8947 36.8421 45.933 5 2009 0
4.70902 58.3732 33.3333 44.4976 5 2010 0
4.83402 58.7678 37.9147 44.5498 5 2011 0
4.82298 57.8199 40.2844 44.0758 5 2012 0
4.66564 54.9763 44.5498 44.0758 5 2013 0
4.55899 65.3846 45.6731 43.75 5 2014 .
4.52303 68.2692 48.5577 44.2308 5 2015 0
end
label values id id
label def id 1 "000002.SZ", modify
label def id 2 "000004.SZ", modify
label def id 3 "000005.SZ", modify
label def id 4 "000006.SZ", modify
label def id 5 "000007.SZ", modify
Code:
pwcorr dep_var Index1 Index2 Index3 Index4, sig star(.01)

             |  dep_var   Index1   Index2   Index3   Index4
-------------+---------------------------------------------
     dep_var |   1.0000
             |
             |
      Index1 |  -0.1242   1.0000
             |   0.2819
             |
      Index2 |  -0.0867   0.7584*  1.0000
             |   0.4757   0.0000
             |
      Index3 |  -0.0658   0.3183*  0.5552*  1.0000
             |   0.5884   0.0021   0.0000
             |
      Index4 |   0.0787   0.1896   0.1301  -0.3035*  1.0000
             |   0.5172   0.0719   0.2190   0.0034
             |
Code:
reg dep_var Index1 Index2 Index3 i.id i.year
note: 2017.year omitted because of collinearity.
note: 2018.year omitted because of collinearity.
note: 2019.year omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =        70
-------------+----------------------------------   F(22, 47)       =      1.28
       Model |  .123540378        22  .005615472   Prob > F        =    0.2345
    Residual |  .206200354        47  .004387242   R-squared       =    0.3747
-------------+----------------------------------   Adj R-squared   =    0.0819
       Total |  .329740732        69  .004778851   Root MSE        =    .06624

------------------------------------------------------------------------------
     dep_var | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      Index1 |   .0607583   .2623863     0.23   0.818    -.4670948    .5886114
      Index2 |  -.0082303   .0411951    -0.20   0.843    -.0911041    .0746435
      Index3 |   .0068583   .1109061     0.06   0.951     -.216256    .2299725
             |
          id |
  000004.SZ  |   .0420866   .0246522     1.71   0.094    -.0075072    .0916804
  000005.SZ  |   .0090231   .0272016     0.33   0.742    -.0456995    .0637458
  000006.SZ  |  -.0035913   .0218879    -0.16   0.870     -.047624    .0404413
  000007.SZ  |   .0108503   .0272195     0.40   0.692    -.0439084    .0656089
             |
        year |
       2002  |   .0473329    1.50762     0.03   0.975    -2.985606    3.080272
       2003  |  -.0182899   .4098085    -0.04   0.965    -.8427182    .8061384
       2004  |    .073309   1.573491     0.05   0.963    -3.092147    3.238765
       2005  |   .0441975   1.844819     0.02   0.981    -3.667099    3.755494
       2006  |   .2388757   1.268526     0.19   0.851     -2.31307    2.790822
       2007  |   .1058522   1.615725     0.07   0.948    -3.144567    3.356271
       2008  |   .0398561   1.309828     0.03   0.976    -2.595177    2.674889
       2009  |   .0099532   1.289834     0.01   0.994    -2.584859    2.604765
       2010  |   .0444741   1.660176     0.03   0.979    -3.295369    3.384318
       2011  |   .0087066    1.14843     0.01   0.994    -2.301636    2.319049
       2012  |  -.0146761   .9171183    -0.02   0.987     -1.85968    1.830328
       2013  |  -.0584361   .5495863    -0.11   0.916    -1.164061    1.047189
       2014  |   .0264604   .1801508     0.15   0.884    -.3359562     .388877
       2015  |   .0321464   .3668656     0.09   0.931     -.705892    .7701847
       2016  |  -.0031748   .3119827    -0.01   0.992     -.630803    .6244534
       2017  |          0  (omitted)
       2018  |          0  (omitted)
       2019  |          0  (omitted)
             |
       _cons |  -.0904379   6.801528    -0.01   0.989    -13.77335    13.59247
------------------------------------------------------------------------------

. estat vif

    Variable |       VIF       1/VIF
-------------+----------------------
      Index1 |    349.95    0.002858
      Index2 |    886.38    0.001128
      Index3 |   6173.44    0.000162
          id |
          2  |      1.47    0.681963
          3  |      1.45    0.691748
          4  |      1.46    0.684869
          5  |      1.45    0.690839
        year |
       2002  |   1953.88    0.000512
       2003  |    109.92    0.009098
       2004  |   1096.42    0.000912
       2005  |   2227.48    0.000449
       2006  |   1053.19    0.000949
       2007  |   2244.14    0.000446
       2008  |   1122.88    0.000891
       2009  |   1430.15    0.000699
       2010  |   2916.77    0.000343
       2011  |   1395.73    0.000716
       2012  |    890.11    0.001123
       2013  |    259.65    0.003851
       2014  |     27.90    0.035844
       2015  |    115.70    0.008643
       2016  |     83.67    0.011952
-------------+----------------------
    Mean VIF |   1106.51
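For readers less familiar with the VIF column, here is a short worked reading of the estat vif table, assuming the standard textbook definition of the variance inflation factor. The symbol R_j^2 is my notation, not something Stata prints: it stands for the R-squared from regressing regressor j on all the other regressors in the model.

```latex
% Standard definition of the variance inflation factor for regressor j
\[
  \mathrm{VIF}_j = \frac{1}{1 - R_j^2}
  \quad\Longleftrightarrow\quad
  R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}
\]

% Index3, using the reported 1/VIF = 0.000162
\[
  R_{\text{Index3}}^2 = 1 - 0.000162 = 0.999838
\]

% For comparison, id = 2, using the reported 1/VIF = 0.681963
\[
  R_{\text{id}=2}^2 = 1 - 0.681963 = 0.318037
\]
```

Read this way, the other regressors in the model (the id and year dummies together with the remaining indices) account for about 99.98% of the variation in Index3, but only about 32% of the variation in the id 2 dummy.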
So my question is: rather than dropping variables, is there something else we can do to deal with the multicollinearity?