I'm trying to do a somewhat tricky mediation test. Basically, I'm looking at the relationship between parental education (4 levels) and the risk of self-harm (binary) in college students, and I want to know if the "academic rank" of the student's college explains part of this association. For those curious, academic rank is measured by the NCES-Barron's Admissions Competitiveness Index. Here, I'm testing mediation by comparing the bottom category (High school degree) to the top (Advanced degree).

I need to adjust for clustering by school but also account for the fact that the mediator is school-specific, not student-specific. Here's what I originally tried:

Code:
svyset schoolID [pw=nrweight] // students are nested in schools, and nrweight adjusts for nonresponse based on each school's demographics

local cov "i.survey_year i.gender i.race age_10cat"

* in the following, sib_any is the outcome, and it stands for self-injurious behavior
* inst_acarank is a 6-level ordinal measure of academic rank (basically, how hard it is to get into that university)
* maxed4_merged is a 4-level measure of parental education
* below, subpop(deg_bach) restricts my test to bachelor's degree students

* gsem needs a distinct outcome name per equation, so copy sib_any to fit both models jointly
gen sib_any2 = sib_any
svy, subpop(deg_bach): gsem (sib_any <- i.maxed4_merged `cov', logit) ///
           (sib_any2 <- i.maxed4_merged `cov' i.inst_acarank, logit) ///
           if inst_acarank!=.

margins, dydx(4.maxed4_merged) post
mlincom 1-2

But I'm thinking I need to do something that actually addresses the multilevel nature of the data. Something like:

Code:
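* m2: schools nested within academic-rank categories
eststo m2: mixed sib_any i.maxed4_merged i.gender i.race i.survey_year age_10cat [pw=nrweight] ///
           || inst_acarank: || schoolID_new:
* m1: same estimation sample as m2; in mixed, the if qualifier must precede the first ||
eststo m1: mixed sib_any i.maxed4_merged i.gender i.race i.survey_year age_10cat if e(sample) [pw=nrweight] ///
           || schoolID_new: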
... and then compare the coefficient on 4.maxed4_merged (High school vs. Advanced degree) across the two models. Btw, I don't know if I need to add
Code:
 ,vce(cluster schoolID)
for these mixed effects models.

What do you think is the best approach? I'm open to all possibilities.