Problem with prediction after mlogit

Hi,

I am working with a household survey for 2022 in which each row is an individual. I have information on demographics as well as employment variables. I want to simulate each individual's employment status and sector for 2024. For simplicity, assume there are only 2 sectors. So far I have created the following categorical variable (called sector_all): 1 if employed in sector A, 2 if employed in sector B, 3 if unemployed, and 4 if out of the labor force. Using this as the dependent variable, I ran the following multinomial logit regression:
mlogit sector_all gender married children indigenous c.age i.educ i.rural hh_size i.region

which I then used to predict the probability that each individual falls in each of the four categories of sector_all:
predict p1 p2 p3 p4, pr
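
A quick sanity check at this stage is that the four predicted probabilities add up to 1 for every person. The sketch below runs on made-up data: the toy variables, the seed, and the shortened covariate list are placeholders for illustration, not the actual survey or model.

```stata
* Toy data standing in for the survey (hypothetical values, not the real file)
clear
set seed 12345
set obs 1000
gen byte gender = runiform() < 0.5
gen age = 18 + floor(48 * runiform())
gen byte sector_all = 1 + floor(4 * runiform())

* Same kind of model as above, with a shortened covariate list
mlogit sector_all gender c.age

* One new variable per outcome, in the order of the outcome values
predict p1 p2 p3 p4, pr

* The four probabilities should add up to 1 for every person
* (up to float rounding, since predict stores floats by default)
gen double p_sum = p1 + p2 + p3 + p4
summarize p_sum
```

If p_sum is missing for some rows in the real data, those observations were probably dropped from the estimation sample because a covariate is missing.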

Now, I would like to use these probabilities to create a simulated version of sector_all for 2024. The caveat is that I would like the distribution of workers in 2024 to follow macro growth data for each sector. Let's say sector A is projected to grow by 5% over that period and sector B is projected to shrink by 3%; I would then like the numbers of workers in sectors A and B to reflect those growth rates.
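
To make the target concrete, here is a minimal sketch of how the 2024 head counts could be derived from the 2022 counts. It assumes the growth rates apply directly to the unweighted number of workers in the sample (survey weights are ignored), and the toy data are made up so that the arithmetic is easy to follow.

```stata
* Toy data (hypothetical): 200 workers in A, 300 in B, 200 unemployed, 300 out
clear
set obs 1000
gen byte sector_all = cond(_n <= 200, 1, cond(_n <= 500, 2, cond(_n <= 700, 3, 4)))

* Target for sector A: 2022 count grown by 5%
count if sector_all == 1
local pred_sectorA = round(r(N) * 1.05)

* Target for sector B: 2022 count reduced by 3%
count if sector_all == 2
local pred_sectorB = round(r(N) * 0.97)

display "Target sector A: `pred_sectorA'"
display "Target sector B: `pred_sectorB'"
```

With these toy counts the targets are 210 workers in sector A and 291 in sector B. Rounding is needed because the targets must be whole numbers of people.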

I am having a lot of trouble finding a way to do this. So far, I have obtained, for each person, the highest probability across all four categories and the category it corresponds to (i.e., the most likely sector they would move to); see my code below:
egen highest_p = rowmax(p1-p4) /* Highest probability */
* Only four outcomes exist, so loop over 1/4 (p5-p8 were never created)
forval i = 1/4 {
    gen aux`i' = `i' if p`i' == highest_p
}
* If two categories tie, rowmax() keeps the higher category number
egen pred_sector_all = rowmax(aux*) /* Predicted sector */

I have tried generating a random number from a uniform distribution, comparing it with this probability, and deciding whether an individual moves based on that comparison, but it never converges to the numbers I need.
gen sector_all2024 = sector_all
gen u = .

loc y_sectorA = 0
loc y_sectorB = 0

* Redraw until both simulated counts match the targets exactly
while `pred_sectorA' != `y_sectorA' | `pred_sectorB' != `y_sectorB' {

    replace sector_all2024 = sector_all
    replace u = runiform()

    replace sector_all2024 = pred_sector_all if u > highest_p

    count if sector_all2024 == 1
    loc y_sectorA = r(N)
    count if sector_all2024 == 2
    loc y_sectorB = r(N)
}
(here pred_sectorA and pred_sectorB are the target numbers of workers in each sector, obtained by applying the growth rates mentioned above)
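
One possible alternative to redrawing until the counts happen to match is to hit the targets by construction. The sketch below is a deterministic ranking rule, not the random-draw loop above: it fills sector A with the people who have the highest p1, then fills sector B from the remaining people with the highest p2, and sends everyone else to whichever of categories 3 and 4 is more likely. The probabilities are made-up stand-ins, the target values 260 and 240 are arbitrary, and r1-r4, rsum, and p2_left are hypothetical helper names.

```stata
* Made-up stand-ins for the predicted probabilities p1-p4 (rows sum to 1)
clear
set seed 2024
set obs 1000
forvalues j = 1/4 {
    gen double r`j' = runiform()
}
gen double rsum = r1 + r2 + r3 + r4
forvalues j = 1/4 {
    gen double p`j' = r`j' / rsum
}

* Arbitrary targets for illustration
local pred_sectorA = 260
local pred_sectorB = 240

gen byte sector_all2024 = .

* Step 1: the pred_sectorA people with the highest p1 go to sector A
gsort -p1
replace sector_all2024 = 1 in 1/`pred_sectorA'

* Step 2: among those still unassigned, the top pred_sectorB by p2 go to B
* (already assigned people get -1 so they sort to the bottom)
gen double p2_left = cond(missing(sector_all2024), p2, -1)
gsort -p2_left
replace sector_all2024 = 2 in 1/`pred_sectorB'

* Step 3: everyone else gets the likelier of unemployed (3) or out of the labor force (4)
replace sector_all2024 = cond(p3 >= p4, 3, 4) if missing(sector_all2024)

tabulate sector_all2024
```

The order in which the sectors are filled matters, and the rule ignores each person's 2022 status, so this is only a starting point. It also assumes the two targets together do not exceed the number of observations.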

Any ideas?? Any help would be much much appreciated.

BJ Data Tech Solution
