I currently have a problem with clustering and calculating means.
My dataset contains the variable sic (the Standard Industrial Classification code). I want to calculate the mean of the variable "tq" per sic cluster, per year.
We cluster as follows:
*Agriculture, forestry and fishing:
sic >=0100 & sic <=0999
*Mining:
sic >=1000 & sic <=1499
*Construction:
sic >=1500 & sic <=1799
*Manufacturing:
sic >=2000 & sic <=3999
*Transportation, communications, electric, gas and sanitary service:
sic >=4000 & sic <=4999
*Wholesale trade:
sic >=5000 & sic <=5199
*Retail Trade:
sic >=5200 & sic <=5999
*Finance, Insurance and Real Estate:
sic >=6000 & sic <=6799
*Services:
sic >=7000 & sic <=8999
*Public Administration:
sic >=9100 & sic <=9729
*Nonclassifiable:
sic >=9900 & sic <=9999
It would be nice if anyone could help me with the code.
Thanks in advance!
Best regards
Jana
----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long gvkey int fyear double tq int sic
1013 2005 1169.2 3661
1013 2006 1281.9 3661
1013 2007 1322.2 3661
1045 2015 40990 4512
1045 2016 40180 4512
1045 2017 42207 4512
1045 2018 44541 4512
1045 2019 45768 4512
1045 2020 17337 4512
1075 2005 2987.955 4911
1075 2006 3401.748 4911
1075 2007 3523.62 4911
1075 2008 3367.076 4911
1075 2009 3297.101 4911
1075 2010 3263.645 4911
1075 2011 3241.379 4911
1075 2012 3301.804 4911
1075 2013 3454.628 4911
1075 2014 3491.632 4911
1075 2015 3495.443 4911
1075 2016 3498.682 4911
1075 2017 3565.296 4911
1075 2018 3691.247 4911
1075 2019 3471.209 4911
1075 2020 3586.982 4911
1078 2005 22287.808 3845
1078 2006 22476.322 3845
1078 2007 25914.238 3845
1078 2008 29527.552 3845
1078 2009 30764.707 3845
1078 2010 35166.721 3845
1078 2011 38851.259 3845
1078 2012 39873.91 3845
1078 2013 21848 3845
1078 2014 20247 3845
1078 2015 20405 3845
1078 2016 20853 3845
1078 2017 27390 3845
1078 2018 30578 3845
1078 2019 31904 3845
1078 2020 34608 3845
1161 2005 5847.577 3674
1161 2006 5649 3674
1161 2007 6013 3674
1161 2008 5808 3674
1161 2009 5403 3674
1161 2010 6494 3674
1161 2011 6568 3674
1161 2012 5422 3674
1161 2013 5299 3674
1177 2005 22491.9 6324
1177 2006 25145.7 6324
1177 2007 27599.6 6324
1177 2008 30950.7 6324
1177 2009 34733.9 6324
1177 2010 34246 6324
1177 2011 33779.8 6324
1177 2012 36595.9 6324
1177 2013 47284.9 6324
1177 2014 58003.2 6324
1177 2015 60226.9 6324
1177 2016 63155 6324
1177 2017 60447 6324
1209 2005 8143.5 2810
1209 2006 8850.4 2810
1209 2007 10037.8 2810
1209 2008 10414.5 2810
1209 2009 8256.2 2810
1209 2010 9026 2810
1209 2011 10082 2810
1209 2012 9611.7 2810
1209 2013 10180.4 2810
1209 2014 10439 2810
1209 2015 9894.9 2810
1209 2016 9524.4 2810
1209 2017 8187.6 2810
1209 2018 8930.2 2810
1209 2019 8918.9 2810
1209 2020 8856.3 2810
1230 2016 5931 4512
1230 2017 7933 4512
1230 2018 8264 4512
1230 2019 8781 4512
1230 2020 3566 4512
1239 2005 3531.231 2844
1239 2006 3772.001 2844
1240 2005 40358 5411
1279 2005 3037.887 4911
1279 2006 3121.489 4911
1279 2007 3307.02 4911
1279 2008 3385.916 4911
1279 2009 3426.8 4911
1279 2010 3902.9 4911
1300 2005 27653 9997
1300 2006 31367 9997
1300 2007 34589 9997
1300 2008 36556 9997
1300 2009 30908 9997
1300 2010 33370 9997
1300 2011 36529 9997
end
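One possible way to build the clusters listed above and then take the mean of tq per cluster and year is sketched below. It assumes sic is stored as a numeric variable, as in the example data where it is an int, so that 0100 is simply 100. The names sic_group and mean_tq are hypothetical and not part of the original dataset. Codes that fall into none of the listed ranges (for example 1800 to 1999) are left without a group and therefore get no mean.

```stata
* Assign each observation to one of the SIC clusters listed in the question.
* sic_group is a hypothetical variable name chosen for this sketch.
generate byte sic_group = .
replace sic_group = 1  if inrange(sic, 100, 999)
replace sic_group = 2  if inrange(sic, 1000, 1499)
replace sic_group = 3  if inrange(sic, 1500, 1799)
replace sic_group = 4  if inrange(sic, 2000, 3999)
replace sic_group = 5  if inrange(sic, 4000, 4999)
replace sic_group = 6  if inrange(sic, 5000, 5199)
replace sic_group = 7  if inrange(sic, 5200, 5999)
replace sic_group = 8  if inrange(sic, 6000, 6799)
replace sic_group = 9  if inrange(sic, 7000, 8999)
replace sic_group = 10 if inrange(sic, 9100, 9729)
replace sic_group = 11 if inrange(sic, 9900, 9999)

* Attach readable labels to the cluster numbers.
label define sic_group                                            ///
    1  "Agriculture, forestry and fishing"                        ///
    2  "Mining"                                                   ///
    3  "Construction"                                             ///
    4  "Manufacturing"                                            ///
    5  "Transportation, communications, electric, gas, sanitary"  ///
    6  "Wholesale trade"                                          ///
    7  "Retail trade"                                             ///
    8  "Finance, insurance and real estate"                       ///
    9  "Services"                                                 ///
    10 "Public administration"                                    ///
    11 "Nonclassifiable"
label values sic_group sic_group

* Mean of tq within each cluster-year cell, stored on every observation.
* Observations without a cluster are skipped, so mean_tq stays missing for them.
bysort sic_group fyear: egen mean_tq = mean(tq) if !missing(sic_group)
```

If one row per cluster and year is preferred over a new variable on every observation, `collapse (mean) mean_tq = tq, by(sic_group fyear)` after dropping unclassified observations would be an alternative; note that collapse replaces the data in memory.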