Hello!

I have census data for households in several countries. I would like to estimate assortative mating statistics (in terms of mothers' and fathers' highest educational attainment) for each geographical region and (birth) cohort of people.

In my data, country, geolev1 and geolev2 represent unique country and within-country regional identifiers. I need the results from all levels, hence I use all three regional variables. However, the educational attainment variables (mothereduc, fathereduc) are categorical: less than primary, primary school etc. so I have to estimate the polychoric correlation.

I would like to create a .dta file which contains the estimated polychoric correlation between parental education for each cohort-region cell. The code I am currently using is:
Code:
foreach admUnit of varlist country geolev1 geolev2{    
    sort `admUnit'
    drop if `admUnit' == .
    
    * Creating results matrix
    matrix R = (0, 0, 0)
    
    * Looping over all possible values of variables in the groups
    egen group = group(`admUnit' cohort)
    sum group, meanonly
    forvalues i=1/`r(max)' {
        * Run the polychoric correlation
        capture polychoric mothereduc fathereduc [aw=perwt] if(group == `i'), nolog pw
        
        * Error handling - if there is no variation in parental education in a region
        if _rc==0{
            local result=r(rho)
        }
        if _rc!=0{
            local result=r(rho)
        }
        
        * Get the region and sample ID
        quietly: sum `admUnit' if(group == `i')
        local admName=r(mean)
        quietly: sum cohort if(group == `i')
        local cohort=r(mean)
        
        * Create matrix from results
        matrix input T = (`result', `admName', `cohort')
        
        * Join current results to all results
        matrix R = R \ T
    }
    * Load output data from result matrix
    matrix colnames R = assort_mating `admUnit' cohort
    
    * Save output
    clear
    svmat R, names(col)
    save "AssortMating2_byCohort_`admUnit'.dta", replace
    
    use "datafile", clear
}
This runs fine on my computer on Stata 16.1, however I would need to run it on a computing cluster which only supports Stata 15. I get an error code "invalid matrix operation" when I try to run the code above. I suspect the line "matrix R = R \ T" is causing issues, but I was wondering if anyone can offer me better insight on this problem.

Due to my contractual obligations, I cannot share extracts from the dataset, however I can provide more detailed variable description:
Country, geolev1, geolev2 - Unique identifiers for each country/administrative regions within countries
cohort - Birth cohort of the older spouse. Grouped into bins such as "1960" for people born 1960-1969, "1970" for people born 1970-79 etc.
polychoric - user-written package to calculate polychoric correlations
mothereduc/fathereduc - categorical (1-4) variable of highest educational qualification of the person.

Thank you in advance for your help,
Marcell