Dear Statalist users,

I have a question about how to build a variable that holds cumulative fractions. My sample data are below.

Code:
clear
input byte id int year str1 class
1 2015 "a"
1 2015 "a"
1 2015 "a"
1 2015 "b"
1 2015 "b"
1 2015 "c"
1 2016 "a"
1 2016 "a"
1 2016 "b"
1 2016 "b"
1 2016 "c"
1 2016 "d"
1 2016 "d"
1 2017 "a"
1 2017 "a"
1 2017 "a"
1 2017 "d"
1 2017 "d"
1 2017 "e"
1 2018 "b"
1 2018 "e"
1 2018 "f"
end
Below is the result that I want to get from the sample data.

Code:
clear
input byte id int year str1 class byte(cum_appearance cum_total) float cumulative_share
1 2015 "a" 3  6        .5
1 2015 "b" 2  6  .3333333
1 2015 "c" 1  6 .16666667
1 2016 "a" 5 13  .3846154
1 2016 "b" 4 13  .3076923
1 2016 "c" 2 13 .15384616
1 2016 "d" 2 13 .15384616
1 2017 "a" 8 19  .4210526
1 2017 "b" 4 19  .2105263
1 2017 "c" 2 19 .10526316
1 2017 "d" 4 19  .2105263
1 2017 "e" 1 19 .05263158
1 2018 "a" 8 22  .3636364
1 2018 "b" 5 22 .22727273
1 2018 "c" 2 22  .0909091
1 2018 "d" 4 22  .1818182
1 2018 "e" 2 22  .0909091
1 2018 "f" 1 22 .04545455
end
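That is, within each "id", my goal is to build a variable "cumulative_share": the cumulative fraction of each letter in "class". It is the number of times the letter has appeared up to and including that year ("cum_appearance") divided by the total number of observations up to and including that year ("cum_total").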

For example, there are 19 observations from 2015 to 2017, and "a" appears 8 times over the same period, so the share of "a" in 2017 must be 8/19.
In the same way, the share of "a" in 2018 must be 8/22: "a" does not appear in 2018, so its count stays at 8 while the total rises to 22.

The problem is that a letter may be missing in a given year.
For example, "a", "c", and "d" do not appear in 2018, but their cumulative fractions should still be calculated for 2018, so that cumulative_share sums to 1 within each year.
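
One possible way to handle the missing letters without a loop is sketched below. It is only an illustration, not a definitive solution: it collapses the data to one row per id, year, and class with -contract-, adds the absent combinations with -fillin-, and then takes running sums. The variable name n is my own choice, and the sample data are repeated only so that the block runs on its own. It assumes every id is observed in every year. With several ids, -fillin- would also create rows for years in which an id has no observations at all.

```stata
clear
input byte id int year str1 class
1 2015 "a"
1 2015 "a"
1 2015 "a"
1 2015 "b"
1 2015 "b"
1 2015 "c"
1 2016 "a"
1 2016 "a"
1 2016 "b"
1 2016 "b"
1 2016 "c"
1 2016 "d"
1 2016 "d"
1 2017 "a"
1 2017 "a"
1 2017 "a"
1 2017 "d"
1 2017 "d"
1 2017 "e"
1 2018 "b"
1 2018 "e"
1 2018 "f"
end

* One row per id-year-class, with the number of appearances in n
contract id year class, freq(n)

* Add the id-year-class combinations that do not occur, with zero appearances
fillin id year class
replace n = 0 if _fillin

* Running count of each class within id, in year order
bysort id class (year): gen cum_appearance = sum(n)

* Cumulative total for each id-year = sum of the cumulative counts over classes
bysort id year: egen cum_total = total(cum_appearance)

* Drop classes that have not yet appeared for this id
drop if cum_appearance == 0

gen cumulative_share = cum_appearance / cum_total

drop n _fillin
sort id year class
list, sepby(year)
```

With the sample data this should leave 18 rows, one per year and letter seen so far, matching the layout of the desired result above.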

I have tried to solve this with a loop, but could not find a solution.
I would really appreciate any advice on this Stata issue.