I have built a panel dataset from a yearly government survey that measures the quality of an institution that exists in all Brazilian municipalities (n=5,500). The data span 2013-2016. I want to create an index that represents the prevalence of certain types of accountability actions performed by each institution in each municipality in each year. The index will be built from four ordinal items: two are binary and two are coded (0,1,2).

If I can confirm strong factorial invariance using CFA in SEM, I'd like to use predict, latent to extract the index, which I will then use as the dependent variable in a fixed effects panel model.

I am trying to follow Little's (2013) advice to test for "strong factorial invariance" in panel CFA by fitting nested models with increasing constraints. Steps 2 and 3 below each add constraints to the previous model. If the model maintains a good fit through all three steps, I have evidence of strong factorial invariance and the index is appropriate to use. Little's worked examples are not in Stata, and I am not certain that I have performed the tests correctly in Stata. I am happy to share the code and/or output, but it is quite long, so I'm hoping people can weigh in based on the .stsem files attached below.

These are the steps I have attempted to follow.
  1. Configural invariance - no constraints.
  2. Weak factorial invariance - constrain each item to have equal loadings across years.
  3. Strong factorial invariance - same as step 2, and also constrain equality of item means and latent variables across years.
You can see how I attempted to perform the three steps in the .stsem models attached. The thing I am most uncertain of is step 3. In his examples, Little uses what seems to be a wave dummy to impose item and latent variable invariance over time. I can't figure out how to do this with wide data in Stata's SEM. I tried generating four wave dummies and adding them to the model, but it does not run. I read something elsewhere that led me to believe I can achieve the same effect by placing an equality constraint on each of the four items across years and on the latent variable across years. So far I have placed equality constraints on the intercepts of the four items (each item has an equal intercept across years).
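
For concreteness, here is a minimal command-syntax sketch of the three steps. It is an illustration, not the exact models in the attached files. The item names a13-d16 (items a, b, c, d with the survey year as suffix) and the latent names F13-F16 are hypothetical placeholders. Shared names after @ impose equality constraints, and freeing the later latent means in step 3 while the 2013 mean stays at zero is an assumption about how strong invariance would be set up with marker-variable identification.

```stata
* Hypothetical wide-format names: items a b c d, suffix = survey year (13-16).

* Step 1: configural model. Same structure each year, no cross-year constraints.
* sem fixes the loading of the first listed item per factor to 1 by default.
sem (a13 b13 c13 d13 <- F13) ///
    (a14 b14 c14 d14 <- F14) ///
    (a15 b15 c15 d15 <- F15) ///
    (a16 b16 c16 d16 <- F16)
estimates store configural

* Step 2: weak invariance. The shared names lb, lc, ld equate each item's
* loading across years.
sem (a13 <- F13@1) (b13 <- F13@lb) (c13 <- F13@lc) (d13 <- F13@ld) ///
    (a14 <- F14@1) (b14 <- F14@lb) (c14 <- F14@lc) (d14 <- F14@ld) ///
    (a15 <- F15@1) (b15 <- F15@lb) (c15 <- F15@lc) (d15 <- F15@ld) ///
    (a16 <- F16@1) (b16 <- F16@lb) (c16 <- F16@lc) (d16 <- F16@ld)
estimates store weak

* Step 3: strong invariance. Step 2 plus equal item intercepts across years
* (shared names ia, ib, ic, id). The 2013 latent mean stays at its default
* of 0; means() frees the latent means for 2014-2016.
sem (a13 <- F13@1 _cons@ia) (b13 <- F13@lb _cons@ib) ///
    (c13 <- F13@lc _cons@ic) (d13 <- F13@ld _cons@id) ///
    (a14 <- F14@1 _cons@ia) (b14 <- F14@lb _cons@ib) ///
    (c14 <- F14@lc _cons@ic) (d14 <- F14@ld _cons@id) ///
    (a15 <- F15@1 _cons@ia) (b15 <- F15@lb _cons@ib) ///
    (c15 <- F15@lc _cons@ic) (d15 <- F15@ld _cons@id) ///
    (a16 <- F16@1 _cons@ia) (b16 <- F16@lb _cons@ib) ///
    (c16 <- F16@lc _cons@ic) (d16 <- F16@ld _cons@id), ///
    means(F14 F15 F16)
estimates store strong

* Compare each model with the less constrained one it is nested in.
lrtest configural weak
lrtest weak strong

* Fit statistics for the model fit last (here, the strong-invariance model).
estat gof, stats(all)
```

Writing one path per item keeps every constraint visible on the command line, which makes it easier to check against what the SEM Builder diagrams impose.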

Does anyone know if what I have done is an adequate replication of Little's test for strong invariance?

If so, my fit stats are adequate. Can I use predict, latent to extract the index?
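
To make the intended workflow concrete, here is a rough sketch of the extraction and panel steps. It assumes the strong-invariance sem model is the active estimation result and that its latent variables are named F13-F16. The municipality identifier muni_id, the index stub acct, and the covariate stub x are all hypothetical names.

```stata
* Assumes a fitted sem model with latent variables F13-F16 is in memory.
* muni_id, acct*, and x* are hypothetical names used only for illustration.

* One predicted factor score per year, in wide format.
predict acct13 acct14 acct15 acct16, latent(F13 F14 F15 F16)

* Keep the identifier, the scores, and a hypothetical covariate x13-x16,
* then reshape to long format for the panel model.
keep muni_id acct13 acct14 acct15 acct16 x13 x14 x15 x16
reshape long acct x, i(muni_id) j(year)
replace year = 2000 + year

* Municipality fixed effects with year dummies and clustered standard errors.
xtset muni_id year
xtreg acct x i.year, fe vce(cluster muni_id)
```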

Does anyone see any other problem with my choice in method? Does anyone have an alternative suggestion on a more robust way to proceed?

cfa configurational invariance.stsem

cfa weak invariance.stsem

cfa strong invariance.stsem