Hello everybody.

I want to track when a group of individuals moves from one employer to another because of privatization, so that I can ultimately test how privatization affects the group by comparing it with a control group. I know when each change occurs (the year and the month) and which public and private employers are involved. The following variables are essential to my current issue: ID_person (numeric), ID_employer (numeric), year (numeric), and month (numeric). The dataset is in long format, and after restricting it to the individuals involved in a particular case of privatization, it has around 400,000 observations.

I have made up the following example (I cannot share a real sample because my data are stored on a confidential server):

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(id_person id_employer year month)
1 100 2013  1
1 100 2013  2
1 100 2013  3
1 100 2013  4
1 100 2013  5
1 100 2013  6
1 100 2013  7
1 100 2013  8
1 100 2013  9
1 100 2013 10
1 100 2013 11
1 100 2013 12
1 200 2014  1
1 200 2014  2
1 200 2014  3
1 200 2014  4
1 200 2014  5
1 200 2014  6
1 200 2014  7
1 200 2014  8
1 200 2014  9
1 200 2014 10
1 200 2014 11
1 200 2014 12
2 100 2013  1
2 100 2013  2
2 100 2013  3
2 100 2013  4
2 100 2013  5
2 100 2013  6
2 100 2013  7
2 100 2013  8
2 100 2013  9
2 100 2013 10
2 100 2013 11
2 100 2013 12
2 200 2014  1
2 200 2014  2
2 200 2014  3
2 200 2014  4
2 200 2014  5
2 200 2014  6
2 200 2014  7
2 200 2014  8
2 200 2014  9
2 200 2014 10
2 200 2014 11
2 200 2014 12
end

So in the example above, two employees moved from employer “100” to employer “200” on Jan 1, 2014.

I have tried different things, such as collapsing the dataset by ID and EMPLOYER and then duplicating the observations (although that takes a few more variables than shown above). This approach lets me count the number of months an individual spends with an employer in 2013 and in 2014. However, it does not tell me whether the employees move from 100 to 200 or the other way around. I could browse all the remaining observations and assess them one by one, but that is error-prone and time-consuming.
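
To make the counting step concrete, here is a minimal sketch of the kind of collapse I mean, run on a shortened version of the made-up example. It uses -contract- rather than my exact code, and the variable name n_months is introduced here only for illustration.

```stata
* Sketch only: shortened made-up data, same variable names as the example above
clear
input float(id_person id_employer year month)
1 100 2013 11
1 100 2013 12
1 200 2014  1
1 200 2014  2
2 100 2013 11
2 100 2013 12
2 200 2014  1
2 200 2014  2
end

* Count the months each person spends with each employer in each year.
* n_months is a hypothetical name chosen for this sketch.
contract id_person id_employer year, freq(n_months)

* One row per person, employer, and year, with the month count
list, sepby(id_person)
```

This leaves one row per person, employer, and year, which gives the month counts but not, by itself, a variable that records the direction of the move.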

I know that Stata has many helpful commands for longitudinal data analysis, but I do not know precisely how to apply them to this issue. Any help would be greatly appreciated!