I have a question about how to build a variable of cumulative fractions. Below is my sample data.
Code:
clear
input byte id int year str1 class
1 2015 "a"
1 2015 "a"
1 2015 "a"
1 2015 "b"
1 2015 "b"
1 2015 "c"
1 2016 "a"
1 2016 "a"
1 2016 "b"
1 2016 "b"
1 2016 "c"
1 2016 "d"
1 2016 "d"
1 2017 "a"
1 2017 "a"
1 2017 "a"
1 2017 "d"
1 2017 "d"
1 2017 "e"
1 2018 "b"
1 2018 "e"
1 2018 "f"
end
The output I want is below.
Code:
clear
input byte id int year str1 class byte(cum_appearance cum_total) float cumulative_share
1 2015 "a" 3  6 .5
1 2015 "b" 2  6 .3333333
1 2015 "c" 1  6 .16666667
1 2016 "a" 5 13 .3846154
1 2016 "b" 4 13 .3076923
1 2016 "c" 2 13 .15384616
1 2016 "d" 2 13 .15384616
1 2017 "a" 8 19 .4210526
1 2017 "b" 4 19 .2105263
1 2017 "c" 2 19 .10526316
1 2017 "d" 4 19 .2105263
1 2017 "e" 1 19 .05263158
1 2018 "a" 8 22 .3636364
1 2018 "b" 5 22 .22727273
1 2018 "c" 2 22 .0909091
1 2018 "d" 4 22 .1818182
1 2018 "e" 2 22 .0909091
1 2018 "f" 1 22 .04545455
end
For example, there are 19 observations from 2015 to 2017, and "a" appears 8 times over the same period, so the share of "a" in 2017 must be 8/19.
In the same way, the share of "a" in 2018 must be 8/22.
The problem is that some classes may be missing in a given year.
For example, "a" and "c" do not appear in 2018, but their cumulative fractions should still be calculated for 2018, so that cumulative_share sums to 1 within each year.
I tried to solve this with a loop, but could not find a solution.
I would really appreciate any advice on this Stata issue.
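For reference, here is one possible approach, offered as a sketch rather than a definitive answer. It assumes the sample data from the first code block is in memory, and the variable name n is my own choice. The idea is to collapse to one row per id, year and class with contract, add the missing combinations with fillin, and then take a running sum within each class. Note that contract replaces the data in memory, so you may want to preserve first. With several ids, fillin balances over all ids at once, so you may need to check that it does not add years an id never had.

```stata
* Assumes the sample data (id, year, class) is already in memory.
* One row per id-year-class, with the number of appearances in n.
contract id year class, freq(n)

* Add the id-year-class combinations that never occur, with a count of zero.
fillin id year class
replace n = 0 if _fillin
drop _fillin

* Running count of each class over the years, within id.
bysort id class (year): gen cum_appearance = sum(n)

* Running total of all observations up to and including each year.
bysort id year: egen cum_total = total(cum_appearance)

* Drop classes that have not appeared yet (for example "d" in 2015).
drop if cum_appearance == 0

gen cumulative_share = cum_appearance / cum_total
sort id year class
list id year class cum_appearance cum_total cumulative_share, sepby(year)
```

Because classes with a zero running count contribute nothing to cum_total, dropping them does not change the shares, and the shares within each id and year still sum to 1.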