Hello,

I have a two-wave balanced panel dataset and I am trying to model change in employment status between the waves. Each person falls into one of four categories: remained unemployed (not in employment at either wave), remained employed (in employment at both waves), was unemployed and became employed, or was employed and became unemployed.

I would like to model the four probabilities, and I have both time-invariant and time-varying predictors. For example, does a person's education or family size make them more likely to be continuously in employment, or to be employed in the first wave and unemployed in the second? My observations are nested within participants, who are nested within districts, so my data have a hierarchical structure.

I am not sure how to calculate the change score and predict the four probabilities. My outcome is a binary variable (cremp2) for whether a person is employed or unemployed at each of waves 1 and 2. I have calculated the change in the outcome (“change”) and generated another variable (“empchange”) to capture within-person change in my data:

Code:
. xtset ID year
       panel variable:  ID (strongly balanced)
        time variable:  year, 2012 to 2018, but with gaps
                delta:  1 unit

. bysort ID (year): gen change = cremp2-cremp2[_n-1]
(5,588 missing values generated)

. ta change

     change |      Freq.     Percent        Cum.
------------+-----------------------------------
         -1 |        664       11.88       11.88
          0 |      3,641       65.16       77.04
          1 |      1,283       22.96      100.00
------------+-----------------------------------
      Total |      5,588      100.00

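* Four-category transition variable; defined on the wave 2 row only,
* because change is missing on each person's first row
gen empchange = .
replace empchange = 0 if change == -1                // employed -> unemployed
replace empchange = 1 if change == 1                 // unemployed -> employed
replace empchange = 2 if change == 0 & cremp2 == 0   // unemployed at both waves
replace empchange = 3 if change == 0 & cremp2 == 1   // employed at both waves
label define chnge 0 "Employed - unemployed" 1 "Unemployed - employed" 2 "Consistently unemployed" 3 "Consistently employed"
label values empchange chnge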
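
As a cross-check on the coding above, the same four categories can be built directly from the previous and current status rather than from the difference. This is only a minimal sketch: it assumes the commands above have already been run, that cremp2 is coded 0/1, and that each ID has exactly two rows; prev and empchange2 are illustrative names that are not in my dataset.

```stata
* Sketch: derive the transition from lagged and current status.
* prev and empchange2 are hypothetical names used only for this check.
bysort ID (year): gen byte prev = cremp2[_n-1]        // wave 1 status, missing on the first row
gen byte empchange2 = .
replace empchange2 = 0 if prev == 1 & cremp2 == 0     // employed -> unemployed
replace empchange2 = 1 if prev == 0 & cremp2 == 1     // unemployed -> employed
replace empchange2 = 2 if prev == 0 & cremp2 == 0     // unemployed at both waves
replace empchange2 = 3 if prev == 1 & cremp2 == 1     // employed at both waves
label values empchange2 chnge

* The two codings should agree wherever empchange is defined
assert empchange2 == empchange if !missing(empchange)
tab empchange2 empchange, missing
```

If the assert passes, the difference-based coding and the status-based coding give the same categories.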

My “change” variable then records, on the wave 2 row, whether a person’s employment status changed (-1 or 1) or stayed the same (0). My predictors are measured at both wave 1 and wave 2. I am not sure whether this is the correct way to calculate the change score, or which modelling strategy would be suitable. I have thought about multilevel mixed-effects models, but I am not sure whether they are possible with a nominal outcome variable.
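
To make the question concrete, the single-level version of what I have in mind would look roughly like the sketch below. It is not a settled specification: it uses only the wave 2 rows (where empchange is defined), so the covariates take their wave 2 values, and district enters only through cluster-robust standard errors rather than as a random effect. The choice of base outcome and the names p0 to p3 are illustrative assumptions.

```stata
* Sketch only: single-level multinomial logit for the four transition categories.
* Uses the rows where empchange is non-missing (the wave 2 row for each person).
mlogit empchange i.educ2 c.hhsize if !missing(empchange), ///
    baseoutcome(2) vce(cluster district)

* Predicted probability of each category, in the order 0, 1, 2, 3
* (p0-p3 are illustrative variable names)
predict p0 p1 p2 p3 if e(sample), pr
```

What I cannot work out is whether something like this can be extended to include a district-level random effect, and how the wave 1 values of the time-varying predictors should enter.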

An example of my data:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double ID int year byte(district educ2 hhsize) float(change empchange) byte cremp2
601004004 2012  1 1 6  . . 0
601004004 2018  1 1 7  0 2 0
601004603 2012  1 0 5  . . 0
601004603 2018  1 1 5  0 2 0
601005902 2012  1 3 4  . . 1
601005902 2018  1 3 4  0 3 1
601012406 2012  1 2 4  . . 0
601012406 2018  1 2 4  0 2 0
601020204 2012  1 3 4  . . 0
601020204 2018  1 3 3  0 2 0
601029206 2012  1 0 4  . . 0
601029206 2018  1 0 2  0 2 0
601032102 2012  1 3 2  . . 1
601032102 2018  1 3 2 -1 0 0
601032309 2012  1 2 5  . . 0
601032309 2018  1 2 6  0 2 0
601037505 2012 21 3 4  . . 0
601037505 2018 21 3 5  0 2 0
601038302 2012  1 3 3  . . 0
601038302 2018  1 3 5  0 2 0
601045003 2012  1 2 4  . . 0
601045003 2018  1 2 5  0 2 0
601050804 2012  1 3 6  . . 0
601050804 2018  1 3 7  0 2 0
601051404 2012  1 2 5  . . 0
601051404 2018  1 2 6  0 2 0
601054402 2012  1 2 4  . . 0
601054402 2018  1 2 5  0 2 0
601055607 2012  1 2 2  . . 0
601055607 2018  1 2 1  1 1 1
601056302 2012  1 2 4  . . 0
601056302 2018  1 2 5  0 2 0
601058302 2012  1 3 4  . . 1
601058302 2018  1 3 5  0 3 1
601058602 2012  1 2 5  . . 0
601058602 2018  1 2 5  0 2 0
601060002 2012  1 0 5  . . 0
601060002 2018  1 0 6  1 1 1
601060202 2012  1 2 4  . . 0
601060202 2018  1 2 4  0 2 0
601060302 2012  1 1 4  . . 0
601060302 2018  1 1 4  0 2 0
601061602 2012  1 3 2  . . 0
601061602 2018  1 3 2  0 2 0
601062002 2012  1 2 5  . . 0
601062002 2018  1 2 5  0 2 0
601062602 2012  1 0 5  . . 0
601062602 2018  1 1 6  0 2 0
601063102 2012  1 0 4  . . 0
601063102 2018  1 2 5  1 1 1
601067002 2012  1 3 4  . . 1
601067002 2018  1 3 4 -1 0 0
601072002 2012  1 2 6  . . 0
601072002 2018  1 2 6  1 1 1
601072702 2012  1 0 3  . . 0
601072702 2018  1 1 4  0 2 0
601072902 2012  1 1 4  . . 0
601072902 2018  1 1 4  1 1 1
601073102 2012  1 1 4  . . 0
601073102 2018  1 1 4  0 2 0
601074502 2012  1 3 4  . . 0
601074502 2018  1 3 4  1 1 1
601075902 2012  1 2 4  . . 0
601075902 2018  1 2 5  0 2 0
601077402 2012  1 2 6  . . 0
601077402 2018  1 2 7  0 2 0
601078802 2012  1 3 5  . . 1
601078802 2018  1 1 5  0 3 1
601080302 2012  1 2 5  . . 0
601080302 2018  1 2 5  1 1 1
601082202 2012  1 2 4  . . 0
601082202 2018  1 2 4  0 2 0
601083402 2012  1 2 5  . . 0
601083402 2018  1 3 5  0 2 0
601085202 2012  1 2 4  . . 0
601085202 2018  1 2 5  0 2 0
601085602 2012  1 2 5  . . 0
601085602 2018  1 2 5  0 2 0
601085703 2012  1 2 4  . . 0
601085703 2018  1 2 6  0 2 0
601086402 2012  1 0 4  . . 0
601086402 2018  1 0 6  0 2 0
601088002 2012  1 3 4  . . 0
601088002 2018  1 3 4  0 2 0
601088202 2012  1 2 5  . . 0
601088202 2018  1 2 5  0 2 0
601088305 2012  1 2 5  . . 0
601088305 2018  1 2 2  0 2 0
601088306 2012  1 2 5  . . 0
601088306 2018  1 2 6  0 2 0
601089002 2012  1 2 4  . . 0
601089002 2018  1 2 5  0 2 0
601089303 2012  1 2 4  . . 1
601089303 2018  1 3 4  0 3 1
601090502 2012  1 3 5  . . 1
601090502 2018  1 3 4  0 3 1
601090602 2012  1 2 5  . . 0
601090602 2018  1 2 5  0 2 0
601091602 2012  1 2 6  . . 0
601091602 2018  1 2 5  0 2 0
end
label values year code
label values district Lgov
label def Lgov 1 "Cairo", modify
label def Lgov 21 "Giza", modify
label values educ2 educ
label def educ 0 "Illiterate", modify
label def educ 1 "Less than vocational secondary", modify
label def educ 2 "Vocational secondary", modify
label def educ 3 "University & post-grad", modify
label values empchange chnge
label def chnge 0 "Employed - unemployed", modify
label def chnge 1 "Unemployed - employed", modify
label def chnge 2 "Consistently unemployed", modify
label def chnge 3 "Consistently employed", modify
label values cremp2 emp
label def emp 0 "No", modify
label def emp 1 "Yes", modify