I have census data for households in several countries. I would like to estimate assortative mating statistics (in terms of mothers' and fathers' highest educational attainment) for each geographical region and (birth) cohort of people.
In my data, country, geolev1 and geolev2 represent unique country and within-country regional identifiers. I need the results from all levels, hence I use all three regional variables. However, the educational attainment variables (mothereduc, fathereduc) are categorical: less than primary, primary school etc. so I have to estimate the polychoric correlation.
I would like to create a .dta file which contains the estimated polychoric correlation between parental education for each cohort-region cell. The code I am currently using is:
Code:
foreach admUnit of varlist country geolev1 geolev2{ sort `admUnit' drop if `admUnit' == . * Creating results matrix matrix R = (0, 0, 0) * Looping over all possible values of variables in the groups egen group = group(`admUnit' cohort) sum group, meanonly forvalues i=1/`r(max)' { * Run the polychoric correlation capture polychoric mothereduc fathereduc [aw=perwt] if(group == `i'), nolog pw * Error handling - if there is no variation in parental education in a region if _rc==0{ local result=r(rho) } if _rc!=0{ local result=r(rho) } * Get the region and sample ID quietly: sum `admUnit' if(group == `i') local admName=r(mean) quietly: sum cohort if(group == `i') local cohort=r(mean) * Create matrix from results matrix input T = (`result', `admName', `cohort') * Join current results to all results matrix R = R \ T } * Load output data from result matrix matrix colnames R = assort_mating `admUnit' cohort * Save output clear svmat R, names(col) save "AssortMating2_byCohort_`admUnit'.dta", replace use "datafile", clear }
Due to my contractual obligations, I cannot share extracts from the dataset, however I can provide more detailed variable description:
Country, geolev1, geolev2 - Unique identifiers for each country/administrative regions within countries
cohort - Birth cohort of the older spouse. Grouped into bins such as "1960" for people born 1960-1969, "1970" for people born 1970-79 etc.
polychoric - user-written package to calculate polychoric correlations
mothereduc/fathereduc - categorical (1-4) variable of highest educational qualification of the person.
Thank you in advance for your help,
Marcell
0 Response to Invalid matrix operation - polychoric correlation
Post a Comment