Hello,

I have a dataset with approximately 20,000 observations. The 4 variables of current interest are sex (levels 1 or 2), age5cat (0/4), ethnicity (0/3), and education (0/3). I ran two commands of the form "svy: mean variable, over(sex age5cat ethnicity education)", with the only difference between them being the analysis variable. Each command produced estimates for 160 subpopulations, one for every combination of the four variables. I then extracted these estimates and divided the values of one matrix by the other using the "mata: st_matrix" command. Finally, I used the "svmat" command to create a new variable in my dataset called bmiaf.

Currently, my dataset has its many variables with 20,000 observations each, plus the new svmat-generated bmiaf variable, which has only 160 non-missing observations (one per subpopulation from my svy: mean commands). I now want to truncate my dataset from 20,000 observations to only 160 and keep only the 4 strata variables (sex age5cat ethnicity education), with values that together represent all 160 subpopulations and that correspond to the bmiaf value in each observation. (I will be merging to another dataset based on these 160 subpopulations of 4 variables.)
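For reference, the workflow described above might look roughly like the sketch below. The names numvar and denvar are hypothetical placeholders for the two analysis variables (the post does not name them, nor which one is the numerator), and the data are assumed to be svyset already.

```stata
* numvar and denvar are hypothetical stand-ins for the two analysis variables
svy: mean numvar, over(sex age5cat ethnicity education)
matrix b1 = e(b)                 // 1 x 160 row vector of subpopulation means

svy: mean denvar, over(sex age5cat ethnicity education)
matrix b2 = e(b)

* Elementwise ratio, transposed to 160 x 1 so that svmat creates one variable
mata: st_matrix("bmiaf", (st_matrix("b1") :/ st_matrix("b2"))')

* svmat names the new variable bmiaf1 and fills only observations 1 to 160
svmat bmiaf
rename bmiaf1 bmiaf
```

Because svmat simply writes the matrix rows into the first 160 observations, the bmiaf values sit next to whichever individuals happen to be stored there, with no link to those individuals' own characteristics.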

Code:
keep sex age5cat ethnicity education bmiaf
keep in 1/160
This gives me a dataset of the right size, but not the right content. The "svmat" command did not carry over any of the strata names, so truncating my dataset to the first 160 observations to fit the bmiaf observations leaves an arbitrary set of 160 individuals who do not represent all 160 subpopulations, and whose bmiaf values do not correspond to their own sex, age5cat, ethnicity, and education values. I now want to either recode or, more likely, generate the "sex age5cat ethnicity education" variables so that the 160 observations cover every combination of these variables, each matched to the correct bmiaf value. The order of the subpopulations was:

Over: sex age5cat ethnicity education
_subpop_1: Male 20-39 White <HS
_subpop_2: Male 20-39 White HSGED
_subpop_3: Male 20-39 White <College
_subpop_4: Male 20-39 White >College
_subpop_5: Male 20-39 Black <HS
_subpop_6: Male 20-39 Black HSGED
_subpop_7: Male 20-39 Black <College
_subpop_8: Male 20-39 Black >College
_subpop_9: Male 20-39 Hispanic <HS
_subpop_10: Male 20-39 Hispanic HSGED
_subpop_11: Male 20-39 Hispanic <College
_subpop_12: Male 20-39 Hispanic >College
_subpop_13: Male 20-39 Other <HS
_subpop_14: Male 20-39 Other HSGED
_subpop_15: Male 20-39 Other <College
_subpop_16: Male 20-39 Other >College
_subpop_17: Male 40-49 White <HS
_subpop_18: Male 40-49 White HSGED
_subpop_19: Male 40-49 White <College
_subpop_20: Male 40-49 White >College
_subpop_21: Male 40-49 Black <HS
_subpop_22: Male 40-49 Black HSGED
_subpop_23: Male 40-49 Black <College
_subpop_24: Male 40-49 Black >College
_subpop_25: Male 40-49 Hispanic <HS
_subpop_26: Male 40-49 Hispanic HSGED
_subpop_27: Male 40-49 Hispanic <College
_subpop_28: Male 40-49 Hispanic >College
_subpop_29: Male 40-49 Other <HS
_subpop_30: Male 40-49 Other HSGED
_subpop_31: Male 40-49 Other <College
_subpop_32: Male 40-49 Other >College
_subpop_33: Male 50-59 White <HS
_subpop_34: Male 50-59 White HSGED
_subpop_35: Male 50-59 White <College
_subpop_36: Male 50-59 White >College
_subpop_37: Male 50-59 Black <HS
_subpop_38: Male 50-59 Black HSGED
_subpop_39: Male 50-59 Black <College
_subpop_40: Male 50-59 Black >College
_subpop_41: Male 50-59 Hispanic <HS
_subpop_42: Male 50-59 Hispanic HSGED
_subpop_43: Male 50-59 Hispanic <College
_subpop_44: Male 50-59 Hispanic >College
_subpop_45: Male 50-59 Other <HS
_subpop_46: Male 50-59 Other HSGED
_subpop_47: Male 50-59 Other <College
_subpop_48: Male 50-59 Other >College
_subpop_49: Male 60-69 White <HS
_subpop_50: Male 60-69 White HSGED
_subpop_51: Male 60-69 White <College
_subpop_52: Male 60-69 White >College
_subpop_53: Male 60-69 Black <HS
_subpop_54: Male 60-69 Black HSGED
_subpop_55: Male 60-69 Black <College
_subpop_56: Male 60-69 Black >College
_subpop_57: Male 60-69 Hispanic <HS
_subpop_58: Male 60-69 Hispanic HSGED
_subpop_59: Male 60-69 Hispanic <College
_subpop_60: Male 60-69 Hispanic >College
_subpop_61: Male 60-69 Other <HS
_subpop_62: Male 60-69 Other HSGED
_subpop_63: Male 60-69 Other <College
_subpop_64: Male 60-69 Other >College
_subpop_65: Male 70+ White <HS
_subpop_66: Male 70+ White HSGED
_subpop_67: Male 70+ White <College
_subpop_68: Male 70+ White >College
_subpop_69: Male 70+ Black <HS
_subpop_70: Male 70+ Black HSGED
_subpop_71: Male 70+ Black <College
_subpop_72: Male 70+ Black >College
_subpop_73: Male 70+ Hispanic <HS
_subpop_74: Male 70+ Hispanic HSGED
_subpop_75: Male 70+ Hispanic <College
_subpop_76: Male 70+ Hispanic >College
_subpop_77: Male 70+ Other <HS
_subpop_78: Male 70+ Other HSGED
_subpop_79: Male 70+ Other <College
_subpop_80: Male 70+ Other >College
_subpop_81: Female 20-39 White <HS
_subpop_82: Female 20-39 White HSGED
_subpop_83: Female 20-39 White <College
_subpop_84: Female 20-39 White >College
_subpop_85: Female 20-39 Black <HS
_subpop_86: Female 20-39 Black HSGED
_subpop_87: Female 20-39 Black <College
_subpop_88: Female 20-39 Black >College
_subpop_89: Female 20-39 Hispanic <HS
_subpop_90: Female 20-39 Hispanic HSGED
_subpop_91: Female 20-39 Hispanic <College
_subpop_92: Female 20-39 Hispanic >College
_subpop_93: Female 20-39 Other <HS
_subpop_94: Female 20-39 Other HSGED
_subpop_95: Female 20-39 Other <College
_subpop_96: Female 20-39 Other >College
_subpop_97: Female 40-49 White <HS
_subpop_98: Female 40-49 White HSGED
_subpop_99: Female 40-49 White <College
_subpop_100: Female 40-49 White >College
_subpop_101: Female 40-49 Black <HS
_subpop_102: Female 40-49 Black HSGED
_subpop_103: Female 40-49 Black <College
_subpop_104: Female 40-49 Black >College
_subpop_105: Female 40-49 Hispanic <HS
_subpop_106: Female 40-49 Hispanic HSGED
_subpop_107: Female 40-49 Hispanic <College
_subpop_108: Female 40-49 Hispanic >College
_subpop_109: Female 40-49 Other <HS
_subpop_110: Female 40-49 Other HSGED
_subpop_111: Female 40-49 Other <College
_subpop_112: Female 40-49 Other >College
_subpop_113: Female 50-59 White <HS
_subpop_114: Female 50-59 White HSGED
_subpop_115: Female 50-59 White <College
_subpop_116: Female 50-59 White >College
_subpop_117: Female 50-59 Black <HS
_subpop_118: Female 50-59 Black HSGED
_subpop_119: Female 50-59 Black <College
_subpop_120: Female 50-59 Black >College
_subpop_121: Female 50-59 Hispanic <HS
_subpop_122: Female 50-59 Hispanic HSGED
_subpop_123: Female 50-59 Hispanic <College
_subpop_124: Female 50-59 Hispanic >College
_subpop_125: Female 50-59 Other <HS
_subpop_126: Female 50-59 Other HSGED
_subpop_127: Female 50-59 Other <College
_subpop_128: Female 50-59 Other >College
_subpop_129: Female 60-69 White <HS
_subpop_130: Female 60-69 White HSGED
_subpop_131: Female 60-69 White <College
_subpop_132: Female 60-69 White >College
_subpop_133: Female 60-69 Black <HS
_subpop_134: Female 60-69 Black HSGED
_subpop_135: Female 60-69 Black <College
_subpop_136: Female 60-69 Black >College
_subpop_137: Female 60-69 Hispanic <HS
_subpop_138: Female 60-69 Hispanic HSGED
_subpop_139: Female 60-69 Hispanic <College
_subpop_140: Female 60-69 Hispanic >College
_subpop_141: Female 60-69 Other <HS
_subpop_142: Female 60-69 Other HSGED
_subpop_143: Female 60-69 Other <College
_subpop_144: Female 60-69 Other >College
_subpop_145: Female 70+ White <HS
_subpop_146: Female 70+ White HSGED
_subpop_147: Female 70+ White <College
_subpop_148: Female 70+ White >College
_subpop_149: Female 70+ Black <HS
_subpop_150: Female 70+ Black HSGED
_subpop_151: Female 70+ Black <College
_subpop_152: Female 70+ Black >College
_subpop_153: Female 70+ Hispanic <HS
_subpop_154: Female 70+ Hispanic HSGED
_subpop_155: Female 70+ Hispanic <College
_subpop_156: Female 70+ Hispanic >College
_subpop_157: Female 70+ Other <HS
_subpop_158: Female 70+ Other HSGED
_subpop_159: Female 70+ Other <College
_subpop_160: Female 70+ Other >College
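
The ordering of this list fixes the mapping from subpopulation number to the four variables: education changes on every row, ethnicity every 4 rows, age5cat every 16 rows, and sex every 80 rows. A minimal sketch that exploits this with integer arithmetic on _n follows. It assumes bmiaf occupies observations 1 to 160 in exactly this order, that none of the 160 cells was dropped as empty, and that the numeric codings are those stated in the post (sex 1/2, the others starting at 0).

```stata
* Keep only the 160 rows that hold bmiaf, and discard the individual-level values
keep in 1/160
keep bmiaf

* Rebuild each variable from the row number, slowest-varying first
gen byte sex       = 1 + floor((_n - 1)/80)
gen byte age5cat   = mod(floor((_n - 1)/16), 5)
gen byte ethnicity = mod(floor((_n - 1)/4), 4)
gen byte education = mod(_n - 1, 4)

* Spot checks: row 1 is Male 20-39 White <HS, row 160 is Female 70+ Other >College
assert sex == 1 & age5cat == 0 & ethnicity == 0 & education == 0 in 1
assert sex == 2 & age5cat == 4 & ethnicity == 3 & education == 3 in 160

* The four variables should now uniquely identify the 160 rows for merging
isid sex age5cat ethnicity education
```

If any subpopulation had been empty in the svy: mean output, the rows would no longer line up with this arithmetic, so the row count of e(b) is worth confirming before relying on it.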


I think I could generate new variables and replace their values manually to recreate these strata. However, I am wondering whether I could recreate these 4 demographic variables over 160 observations by looping over the levelsof results for these variables, to make the coding more efficient. Based on this subpopulation list, I would like observation 1 to have sex(1), age5cat(0), ethnicity(0), and education(0); observation 2 to have sex(1), age5cat(0), ethnicity(0), and education(1); and so on. I am not great with loop coding and I am having some difficulties, so I would appreciate any help if this is possible. Below is what I came up with, but I am not sure whether it is anywhere near correct in Stata 13. In particular, I am not sure how to tie the "replace" commands together so that, say, a generated education value is tied to the generated ethnicity, age5cat, and sex values:


Code:
levelsof sex, local(sexlevels)
gen sex2 = .
levelsof age5cat, local(age5catlevels)
gen age5cat2 = .
levelsof ethnicity, local(ethnicitylevels)
gen ethnicity2 = .
levelsof education, local(educationlevels)
gen education2 = .

    forval i = 1/160 {
        foreach x in varlist sex2 age5cat2 ethnicity2 education2 {
            foreach y of local sexlevels {
                replace sex2=`sexlevels' if `x' ==`y' &
                foreach y of local age5catlevels {
                    replace age5cat2=`age5catlevels' if `x' ==`y' &
                    foreach y of local ethnicitylevels {
                        replace ethnicity2=`ethnicitylevels' if `x' ==`y' &
                        foreach y of local educationlevels {
                            replace education2=`educationlevels' if `x' ==`y'
                            local ++i
                        }
                    }
                }
            }
        }
    }

Which produces:
invalid '2'
r(198);
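
The error most likely comes from `sexlevels' expanding to the whole list "1 2", so the first replace reads "replace sex2=1 2 ...", which is not a valid expression. Two further problems in the attempt above: "foreach x in varlist ..." treats the word varlist as a list element ("of varlist" is the form that takes a variable list), and the trailing & leaves each if condition unfinished. One possible way to tie the four values together, sketched under the assumption that bmiaf already sits in observations 1 to 160 in the listed order, is to let a single counter pick the row and fill all four variables for that row inside the innermost loop:

```stata
* Run on the full dataset, before truncating, so levelsof sees every level
levelsof sex,       local(sexlevels)
levelsof age5cat,   local(age5catlevels)
levelsof ethnicity, local(ethnicitylevels)
levelsof education, local(educationlevels)

* Creates sex2, age5cat2, ethnicity2, education2, all missing
foreach v in sex age5cat ethnicity education {
    gen `v'2 = .
}

* The nested loops visit the combinations in the same order as the
* _subpop_ list; local i is the observation currently being filled
local i = 1
foreach s of local sexlevels {
    foreach a of local age5catlevels {
        foreach e of local ethnicitylevels {
            foreach d of local educationlevels {
                quietly replace sex2       = `s' in `i'
                quietly replace age5cat2   = `a' in `i'
                quietly replace ethnicity2 = `e' in `i'
                quietly replace education2 = `d' in `i'
                local ++i
            }
        }
    }
}

* Only now drop the individual-level rows and variables
keep in 1/160
keep sex2 age5cat2 ethnicity2 education2 bmiaf
```

The outer forval over 1/160 is not needed here, since the four nested loops already run 2 x 5 x 4 x 4 = 160 times and the counter advances once per combination.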