Hello there,

I am trying to create a correlation matrix with several variables. One of the variables in my data set contains the GIC industry code. In my matrix, the correlation is computed for the single variable "Industry" (the code has been truncated so that only the first two digits, which represent the industry category, remain). Is it possible to split this variable into sub-variables in the correlation matrix (see the first picture), so that I can see the correlations for each industry separately rather than for the overall variable "Industry"?

I am using Stata 16.1.

Thanks for your help!
Best regards
Sven

Code:
    *Calculate R&D Intensity (RDI)
    gen rdi=xrd/revt

    *Set a panel for the data 
    xtset gvkey fyear 

    *Generate lagged variables (the L. operators rely on the xtset above and respect gaps in fyear)
    generate patent_app_count_l3 = L3.patent_app_count
    generate cit_forw_l3 = L3.cit_forw
    generate c_ma_deal_l1 = L1.c_ma_deal
    
    *Winsorize revenue at 1% level
    winsor revt, gen(revt_w) p(0.01)    

    *Winsorize R&D Intensity at 1% level
    winsor rdi, gen(rdi_w) p(0.01)

    *Calculate logarithms for R&D Intensity, revenue 
    gen ln_rdi_w = ln(rdi_w)
    gen ln_revt_w = ln(revt_w)
    
    *Simplification of the “Gind” (GIC Industry) variable
    gen industry_1 = substr(gind, 1,2)
    encode industry_1, generate(industry)

    *Tabulate the correlations matrix    
    pwcorr c_cvc_deal patent_app_count cit_forw ln_revt_w ln_rdi_w c_ma_deal_l1 industry, obs sig star(0.05)
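
One approach that might do what I am after, as a rough sketch only: create one 0/1 indicator variable per industry category with the generate() option of tabulate, then pass those indicators to pwcorr in place of the single encoded variable. The stub name industry_d is just a placeholder chosen for illustration, and the sketch assumes the variables created in the code above already exist.

```stata
    *Create one 0/1 indicator per industry category (industry_d1, industry_d2, ...)
    *Note: "industry_d" is a hypothetical stub name, not part of the original code
    tabulate industry, generate(industry_d)

    *Correlate the other variables with each industry indicator separately
    pwcorr c_cvc_deal patent_app_count cit_forw ln_revt_w ln_rdi_w c_ma_deal_l1 industry_d*, obs sig star(0.05)
```

Each industry_d variable equals 1 for firms in that category and 0 otherwise, so its row in the matrix would show the correlation with membership in that one industry. The encoded variable "industry" itself is only a set of arbitrary category numbers, so its own correlation coefficient is hard to interpret.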