Hi Statalist.

I want to test whether the effect of education level differs by gender. Here is my draft code, which first creates an education variable for males:
Code:
gen male_educ = 1 if (edhigh1 == 9 | p_edhigh1 == 9) & (hgsex == 1 | p_hgsex == 1) // up to year 11 "11 years"
replace male_educ = 2 if (edhigh1 == 8 | p_edhigh1 == 8) & (hgsex == 1 | p_hgsex == 1) // year 12 "12 years"
replace male_educ = 3 if (edhigh1 == 5 | p_edhigh1 == 5) & (hgsex == 1 | p_hgsex == 1) // cert 3, cert 4 "13 years"
replace male_educ = 4 if (edhigh1 == 4 | p_edhigh1 == 4) & (hgsex == 1 | p_hgsex == 1) // adv dip, diploma "14 years"
replace male_educ = 5 if (edhigh1 == 3 | p_edhigh1 == 3) & (hgsex == 1 | p_hgsex == 1) // bachelor, honours "18-19 years"
replace male_educ = 6 if (edhigh1 == 2 | p_edhigh1 == 2) & (hgsex == 1 | p_hgsex == 1) // grad diploma, grad cert "19-20 years"
replace male_educ = 7 if (edhigh1 == 1 | p_edhigh1 == 1) & (hgsex == 1 | p_hgsex == 1) // masters, doctorate "20-24 years"
I then repeat the same code for females:
Code:
gen fem_educ = 1 if (edhigh1 == 9 | p_edhigh1 == 9) & (hgsex == 2 | p_hgsex == 2) // up to year 11 "11 years"
replace fem_educ = 2 if (edhigh1 == 8 | p_edhigh1 == 8) & (hgsex == 2 | p_hgsex == 2) // year 12 "12 years"
replace fem_educ = 3 if (edhigh1 == 5 | p_edhigh1 == 5) & (hgsex == 2 | p_hgsex == 2) // cert 3, cert 4 "13 years"
replace fem_educ = 4 if (edhigh1 == 4 | p_edhigh1 == 4) & (hgsex == 2 | p_hgsex == 2) // adv dip, diploma "14 years"
replace fem_educ = 5 if (edhigh1 == 3 | p_edhigh1 == 3) & (hgsex == 2 | p_hgsex == 2) // bachelor, honours "18-19 years"
replace fem_educ = 6 if (edhigh1 == 2 | p_edhigh1 == 2) & (hgsex == 2 | p_hgsex == 2) // grad diploma, grad cert "19-20 years"
replace fem_educ = 7 if (edhigh1 == 1 | p_edhigh1 == 1) & (hgsex == 2 | p_hgsex == 2) // masters, doctorate "20-24 years"
Sample data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(id p_id) byte(wave edhigh1 p_edhigh1  hgsex p_hgsex)
101  102  1 5 9 1 2
101  102  2 5 9 1 2
101  102  3 5 9 1 2
101  102  4 5 9 1 2
103  104  1 9 5 2 1
103  104  2 9 5 2 1
103  104  3 9 5 2 1
103  104  4 9 5 2 1
106 142 11 5 5 2 1
106 142 12 5 5 2 1
106 142 13 5 5 2 1
106 142 14 5 5 2 1
106 142 15 5 5 2 1
106 142 16 5 5 2 1
106 142 17 5 5 2 1
106 142 18 5 5 2 1
110 163 12 1 3 1 2
110 163 13 1 3 1 2
110 163 14 1 3 1 2
110 163 15 1 3 1 2
110 163 16 1 3 1 2
110 163 17 1 3 1 2
110 163 18 1 3 1 2
111  231  6 9 4 2 1
111  231  7 9 4 2 1
111  231  8 9 4 2 1
111  231  9 9 4 2 1
end
I would appreciate help correcting/improving this code.
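
One possible direction, offered as a sketch rather than a tested solution: the draft conditions do not tie each education code to the sex of the person it belongs to, and in Stata `&` binds more tightly than `|`, so parentheses matter. The version below assumes that hgsex and p_hgsex are coded 1 = male and 2 = female, and that male_educ should hold the education of whichever member of the couple is male (likewise fem_educ for the female member). The helper names r_educ and p_educ are hypothetical, and the five rows are one wave per couple taken from the sample data above. Same-sex couples would need separate handling.

```stata
* Sketch only. Assumes hgsex/p_hgsex are coded 1 = male, 2 = female.
* r_educ and p_educ are hypothetical helper names, not variables in the original data.
clear
input long(id p_id) byte(wave edhigh1 p_edhigh1 hgsex p_hgsex)
101 102  1 5 9 1 2
103 104  1 9 5 2 1
106 142 11 5 5 2 1
110 163 12 1 3 1 2
111 231  6 9 4 2 1
end

* Map the edhigh1 codes to the 1-7 scale used in the draft; any other code becomes missing.
recode edhigh1 p_edhigh1 (9 = 1) (8 = 2) (5 = 3) (4 = 4) (3 = 5) (2 = 6) (1 = 7) (else = .), generate(r_educ p_educ)

* Take each sex's education from whichever member of the couple has that sex.
generate byte male_educ = cond(hgsex == 1, r_educ, cond(p_hgsex == 1, p_educ, .))
generate byte fem_educ  = cond(hgsex == 2, r_educ, cond(p_hgsex == 2, p_educ, .))

list id hgsex edhigh1 p_edhigh1 male_educ fem_educ, noobs
```

Recoding once and then selecting by sex keeps the code-to-level mapping in a single place, so the male and female variables cannot drift apart.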

(New variable based on edhigh1, tabulated below.)
[Image: tabulation of edhigh1]


N.B. I am using Stata 15.1 with panel data. Variables are recorded separately for respondents and their partners; the prefix "p_" marks the partner's value.