I would like an overview of the industry distribution in my sample, to see which industries dominate it. An example of the data is listed below.
Does anyone know how to do this?
ggroup is the industry identifier, gvkey is the firm identifier, and datadate is the time identifier.
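One possible way to get this overview is sketched here. It assumes the example data listed in this post is already in memory. The variable name firm_tag is made up for illustration. The first table counts firm-quarter observations per industry, and the second counts each firm only once per industry, which may be the more useful view when firms have very different numbers of quarters.

```stata
* Assumes the example data from this post is in memory.

* Firm-quarter observations per industry, most frequent first
tabulate ggroup, sort

* Distinct firms per industry: tag exactly one observation
* per (ggroup, gvkey) pair, then tabulate only the tagged rows.
* firm_tag is an illustrative name, not part of the original data.
egen byte firm_tag = tag(ggroup gvkey)
tabulate ggroup if firm_tag, sort
```

In the example data, firm 001004 contributes 96 quarters and firm 001010 only one, so the two tables can rank industries quite differently.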
Also, in the next test, I would like to merge this data with another variable (STR, measured at the firm-quarter level) and see the top 10 industries ranked by average STR. Could you share some suggestions on how to achieve this?
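A minimal sketch of the merge and ranking follows, under several assumptions. The file name str_data.dta is hypothetical. That file is assumed to hold gvkey, datadate, and STR with one row per firm-quarter, and gvkey and datadate are assumed to have the same storage types in both datasets. The names mean_STR and n_obs are made up for illustration.

```stata
* Assumes the industry data from this post is in memory and that
* gvkey and datadate uniquely identify rows in both datasets.
* isid stops with an error if they do not.
isid gvkey datadate

* str_data.dta is a hypothetical file name for the STR dataset.
merge 1:1 gvkey datadate using "str_data.dta", keep(match) nogenerate

* One row per industry: average STR and the number of
* non-missing firm-quarters behind that average.
collapse (mean) mean_STR = STR (count) n_obs = STR, by(ggroup)

* Sort from highest to lowest average and list up to 10 industries.
gsort -mean_STR
list ggroup mean_STR n_obs in 1/`=min(10, _N)', noobs
```

Showing n_obs next to the average helps spot industries whose rank rests on only a handful of firm-quarters. If isid fails, the duplicates would need to be resolved before a 1:1 merge is appropriate.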
Thank you very much in advance!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 gvkey long datadate str6(datacqtr datafqtr) str2 costat str4 ggroup str2 gsector
"001004" 12842 "1995Q1" "1994Q3" "A" "2010" "20"
"001004" 12934 "1995Q2" "1994Q4" "A" "2010" "20"
"001004" 13026 "1995Q3" "1995Q1" "A" "2010" "20"
"001004" 13117 "1995Q4" "1995Q2" "A" "2010" "20"
"001004" 13208 "1996Q1" "1995Q3" "A" "2010" "20"
"001004" 13300 "1996Q2" "1995Q4" "A" "2010" "20"
"001004" 13392 "1996Q3" "1996Q1" "A" "2010" "20"
"001004" 13483 "1996Q4" "1996Q2" "A" "2010" "20"
"001004" 13573 "1997Q1" "1996Q3" "A" "2010" "20"
"001004" 13665 "1997Q2" "1996Q4" "A" "2010" "20"
"001004" 13757 "1997Q3" "1997Q1" "A" "2010" "20"
"001004" 13848 "1997Q4" "1997Q2" "A" "2010" "20"
"001004" 13938 "1998Q1" "1997Q3" "A" "2010" "20"
"001004" 14030 "1998Q2" "1997Q4" "A" "2010" "20"
"001004" 14122 "1998Q3" "1998Q1" "A" "2010" "20"
"001004" 14213 "1998Q4" "1998Q2" "A" "2010" "20"
"001004" 14303 "1999Q1" "1998Q3" "A" "2010" "20"
"001004" 14395 "1999Q2" "1998Q4" "A" "2010" "20"
"001004" 14487 "1999Q3" "1999Q1" "A" "2010" "20"
"001004" 14578 "1999Q4" "1999Q2" "A" "2010" "20"
"001004" 14669 "2000Q1" "1999Q3" "A" "2010" "20"
"001004" 14761 "2000Q2" "1999Q4" "A" "2010" "20"
"001004" 14853 "2000Q3" "2000Q1" "A" "2010" "20"
"001004" 14944 "2000Q4" "2000Q2" "A" "2010" "20"
"001004" 15034 "2001Q1" "2000Q3" "A" "2010" "20"
"001004" 15126 "2001Q2" "2000Q4" "A" "2010" "20"
"001004" 15218 "2001Q3" "2001Q1" "A" "2010" "20"
"001004" 15309 "2001Q4" "2001Q2" "A" "2010" "20"
"001004" 15399 "2002Q1" "2001Q3" "A" "2010" "20"
"001004" 15491 "2002Q2" "2001Q4" "A" "2010" "20"
"001004" 15583 "2002Q3" "2002Q1" "A" "2010" "20"
"001004" 15674 "2002Q4" "2002Q2" "A" "2010" "20"
"001004" 15764 "2003Q1" "2002Q3" "A" "2010" "20"
"001004" 15856 "2003Q2" "2002Q4" "A" "2010" "20"
"001004" 15948 "2003Q3" "2003Q1" "A" "2010" "20"
"001004" 16039 "2003Q4" "2003Q2" "A" "2010" "20"
"001004" 16130 "2004Q1" "2003Q3" "A" "2010" "20"
"001004" 16222 "2004Q2" "2003Q4" "A" "2010" "20"
"001004" 16314 "2004Q3" "2004Q1" "A" "2010" "20"
"001004" 16405 "2004Q4" "2004Q2" "A" "2010" "20"
"001004" 16495 "2005Q1" "2004Q3" "A" "2010" "20"
"001004" 16587 "2005Q2" "2004Q4" "A" "2010" "20"
"001004" 16679 "2005Q3" "2005Q1" "A" "2010" "20"
"001004" 16770 "2005Q4" "2005Q2" "A" "2010" "20"
"001004" 16860 "2006Q1" "2005Q3" "A" "2010" "20"
"001004" 16952 "2006Q2" "2005Q4" "A" "2010" "20"
"001004" 17044 "2006Q3" "2006Q1" "A" "2010" "20"
"001004" 17135 "2006Q4" "2006Q2" "A" "2010" "20"
"001004" 17225 "2007Q1" "2006Q3" "A" "2010" "20"
"001004" 17317 "2007Q2" "2006Q4" "A" "2010" "20"
"001004" 17409 "2007Q3" "2007Q1" "A" "2010" "20"
"001004" 17500 "2007Q4" "2007Q2" "A" "2010" "20"
"001004" 17591 "2008Q1" "2007Q3" "A" "2010" "20"
"001004" 17683 "2008Q2" "2007Q4" "A" "2010" "20"
"001004" 17775 "2008Q3" "2008Q1" "A" "2010" "20"
"001004" 17866 "2008Q4" "2008Q2" "A" "2010" "20"
"001004" 17956 "2009Q1" "2008Q3" "A" "2010" "20"
"001004" 18048 "2009Q2" "2008Q4" "A" "2010" "20"
"001004" 18140 "2009Q3" "2009Q1" "A" "2010" "20"
"001004" 18231 "2009Q4" "2009Q2" "A" "2010" "20"
"001004" 18321 "2010Q1" "2009Q3" "A" "2010" "20"
"001004" 18413 "2010Q2" "2009Q4" "A" "2010" "20"
"001004" 18505 "2010Q3" "2010Q1" "A" "2010" "20"
"001004" 18596 "2010Q4" "2010Q2" "A" "2010" "20"
"001004" 18686 "2011Q1" "2010Q3" "A" "2010" "20"
"001004" 18778 "2011Q2" "2010Q4" "A" "2010" "20"
"001004" 18870 "2011Q3" "2011Q1" "A" "2010" "20"
"001004" 18961 "2011Q4" "2011Q2" "A" "2010" "20"
"001004" 19052 "2012Q1" "2011Q3" "A" "2010" "20"
"001004" 19144 "2012Q2" "2011Q4" "A" "2010" "20"
"001004" 19236 "2012Q3" "2012Q1" "A" "2010" "20"
"001004" 19327 "2012Q4" "2012Q2" "A" "2010" "20"
"001004" 19417 "2013Q1" "2012Q3" "A" "2010" "20"
"001004" 19509 "2013Q2" "2012Q4" "A" "2010" "20"
"001004" 19601 "2013Q3" "2013Q1" "A" "2010" "20"
"001004" 19692 "2013Q4" "2013Q2" "A" "2010" "20"
"001004" 19782 "2014Q1" "2013Q3" "A" "2010" "20"
"001004" 19874 "2014Q2" "2013Q4" "A" "2010" "20"
"001004" 19966 "2014Q3" "2014Q1" "A" "2010" "20"
"001004" 20057 "2014Q4" "2014Q2" "A" "2010" "20"
"001004" 20147 "2015Q1" "2014Q3" "A" "2010" "20"
"001004" 20239 "2015Q2" "2014Q4" "A" "2010" "20"
"001004" 20331 "2015Q3" "2015Q1" "A" "2010" "20"
"001004" 20422 "2015Q4" "2015Q2" "A" "2010" "20"
"001004" 20513 "2016Q1" "2015Q3" "A" "2010" "20"
"001004" 20605 "2016Q2" "2015Q4" "A" "2010" "20"
"001004" 20697 "2016Q3" "2016Q1" "A" "2010" "20"
"001004" 20788 "2016Q4" "2016Q2" "A" "2010" "20"
"001004" 20878 "2017Q1" "2016Q3" "A" "2010" "20"
"001004" 20970 "2017Q2" "2016Q4" "A" "2010" "20"
"001004" 21062 "2017Q3" "2017Q1" "A" "2010" "20"
"001004" 21153 "2017Q4" "2017Q2" "A" "2010" "20"
"001004" 21243 "2018Q1" "2017Q3" "A" "2010" "20"
"001004" 21335 "2018Q2" "2017Q4" "A" "2010" "20"
"001004" 21427 "2018Q3" "2018Q1" "A" "2010" "20"
"001004" 21518 "2018Q4" "2018Q2" "A" "2010" "20"
"001009" 12814 "1994Q4" "1995Q1" "I" "1510" "15"
"001009" 12903 "1995Q1" "1995Q2" "I" "1510" "15"
"001009" 12995 "1995Q2" "1995Q3" "I" "1510" "15"
"001010" 12873 "1995Q1" "1995Q1" "I" "2030" "20"
end
format %d datadate