Dear all,

I am working a study on sick leave in relation to an interested outcome. However, I am struggling in calculation of sick leaves due to overlapping dates. Some individuals have overlapping sick leaves for different diagnosis (received from different registers). For example, id=1 and id= 2 have overlapping leaves for different diagnosis. I would like to calculate total sick leaves regardless of diagnosis (total_days) and overlapping days (overlap). I tried manually for total_days and overlap in the example dataset but it will not be ok for the large dataset with >100,000 observations. So I would like to request your help. The example data set are as follow;

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id int(from to) byte diagnosis int(total_days overlap)
1 14976 15705 1 6691 2915
1 16437 17531 1 6691 2915
1 17898 18627 1 6691 2915
1 19359 21184 1 6691 2915
1 21550 22400 1 6691 2915
1 15341 17531 2 6691 2915
1 18993 20088 2 6691 2915
1 20820 21914 2 6691 2915
2 14976 18262 1 6327 1459
2 20089 21914 1 6327 1459
2 17532 18992 2 6327 1459
2 21185 22400 2 6327 1459
3 14976 16436 1 5841    0
3 18263 21914 1 5841    0
3 17167 17897 2 5841    0
end
format %tdnn/dd/CCYY from
format %tdnn/dd/CCYY to