Hi everybody

I am working with panel data to test how employees are affected when they are insourced from a private employer to a public employer, using outcomes such as salary and working hours. The dataset contains 260,700,000 observations and covers every month from 2008 through 2020 (i.e., 2008m1, 2008m2, ..., 2020m12). The data are stored on a confidential server, so I have made a very simple example dataset for illustration:
Code:
clear
input float(id ym company_id share_full_time_work) byte duplicates
1 648 1     1 1
1 648 2    .2 1
1 649 1     1 1
1 649 2    .4 1
2 682 3   .05 1
2 682 4    .2 1
3 651 5     1 2
3 651 6 .0517 2
3 651 7   .19 2
end
format %tm ym
My issue is that I cannot set up my panel data with xtset id ym, because id and ym do not uniquely identify observations: some employees have multiple employers per month (“employer” is defined from payouts such as salary and employee benefits). I want to use this information (i.e., primary employer, secondary employer, tertiary employer, etc.) in my analysis, so I cannot simply collapse each month and keep only the primary employer.
What is the simplest way to deal with this issue, e.g., to generate primary, secondary, and tertiary employers? In some cases, I have more than 10 employers per individual in a month. I would be willing to keep only the three dominant employers per individual per month if necessary.
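To make the question concrete, here is a minimal sketch of the kind of result I have in mind, run on the example data above. It assumes that the employer with the largest share_full_time_work in a month counts as the primary employer, with ties broken by company_id; that ranking rule and the names neg_share and employer_rank are only made up for illustration.

```stata
* Negate the share so that an ascending sort puts the largest share first
* (missing shares stay missing and therefore sort last)
generate double neg_share = -share_full_time_work

* Rank employers within each person-month: 1 = primary, 2 = secondary, ...
bysort id ym (neg_share company_id): generate byte employer_rank = _n

* Keep at most the three dominant employers per person-month
keep if employer_rank <= 3
drop neg_share

* One row per person-month, with employers spread across columns
* (company_id1, share_full_time_work1, company_id2, ...)
reshape wide company_id share_full_time_work, i(id ym) j(employer_rank)

* id and ym now uniquely identify observations
xtset id ym
```

I am not sure this is the best route with 260,700,000 observations, since reshape may be slow at that size. An alternative I could imagine is to keep the data long and declare each id-by-employer_rank combination as its own panel unit (e.g., egen long job_id = group(id employer_rank), followed by xtset job_id ym).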

Best,
Gustav