Hi,
I am working with a household survey for 2022, where each row is an individual. I have information on demographics as well as employment variables. I want to simulate each individual's employment status and sector for 2024. For simplicity, assume we are working with 2 sectors. What I have done so far is create the following categorical variable (I called it sector_all): 1 if employed in sector A, 2 if employed in sector B, 3 if unemployed, and 4 if out of the labor force. Using this as the dependent variable, I ran the following multinomial logit regression:
mlogit sector_all gender married children indigenous c.age i.educ i.rural hh_size i.region
which I then used to predict the probability that each individual falls in each of the four categories of sector_all:
predict p1 p2 p3 p4, pr
Now, I would like to use these probabilities to create a simulated version of sector_all, but for 2024. The caveat is that I would like the distribution of workers in 2024 to follow macro growth data for each sector. Let's imagine that sector A is projected to grow by 5% over that period and sector B is projected to shrink by 3%; then I would like the number of workers in sectors A and B to reflect those growth rates.
I am having a lot of trouble finding a way to do this. So far, I have obtained for each person the highest probability across all four categories and the category it corresponds to (i.e. the most likely sector they would move to). See my code below:
egen highest_p = rowmax(p1-p4) /*Highest probability*/
forval i = 1/4 { /*Only p1-p4 exist, one per category of sector_all*/
    gen aux`i' = `i' if p`i' == highest_p
}
egen pred_sector_all = rowmax(aux*) /*Predicted sector*/
I have tried generating a random number from a uniform distribution, comparing it to this probability, and deciding whether an individual moves based on that comparison, but it never converges to the numbers I need.
gen sector_all2024 = sector_all
gen u = .
loc y_sectorA = 0
loc y_sectorB = 0
* Redraw until both simulated counts equal the targets exactly
while `pred_sectorA' != `y_sectorA' | `pred_sectorB' != `y_sectorB' {
    replace sector_all2024 = sector_all
    replace u = runiform()
    * Move to the most likely category when the draw exceeds highest_p
    replace sector_all2024 = pred_sector_all if u > highest_p
    count if sector_all2024 == 1
    loc y_sectorA = r(N)
    count if sector_all2024 == 2
    loc y_sectorB = r(N)
}
(here pred_sectorA and pred_sectorB are the target number of workers in each corresponding sector after using the growth rates mentioned before)
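To make the targets concrete, here is a minimal sketch of how pred_sectorA and pred_sectorB could be computed from the 2022 counts. It assumes the growth rates from my example (+5% for sector A, -3% for sector B) and unweighted counts, so survey weights are ignored. The toy data and the locals growthA and growthB are only for illustration and are not part of my actual do-file.

```stata
* Toy data standing in for the 2022 survey (hypothetical; use the real sector_all instead)
clear
set obs 1000
set seed 12345
gen sector_all = ceil(4 * runiform())

* Assumed growth rates taken from the example in the text
local growthA =  0.05
local growthB = -0.03

* Target = 2022 head count in the sector, scaled by its growth rate and rounded
count if sector_all == 1
local pred_sectorA = round(r(N) * (1 + `growthA'))
count if sector_all == 2
local pred_sectorB = round(r(N) * (1 + `growthB'))

display "Target sector A: `pred_sectorA'   Target sector B: `pred_sectorB'"
```

With weighted survey data the counts would presumably need to be replaced by weighted totals, which this sketch does not attempt.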
Any ideas?? Any help would be much much appreciated.