Wednesday, March 8, 2023

Looping running sum and total for changing range within grouped variables.

Hi everyone,

I'm relatively new to Stata, so please excuse me if the explanation of my problem is a bit all over the place. I will try my best:

Here is what my data looks like:
Code:
   
clear
input rank percent weight
1 1 2000
2 1 3000
3 2 7000
4 2 1200
5 3 3000
6 3 4500
 end
What I want to do is build a loop that, for the first group (percent = 1),
- generates a variable holding the running (cumulative) sum of weight across the observations in that group
- generates a variable holding the total weight of that group

And for the second group (percent = 2),
- generates a variable holding the running sum of weight across all observations with percent = 1 or percent = 2,
- generates a variable holding the total weight of percent = 1 and percent = 2 combined

and so on for each subsequent group.

In my data, the values of percent range from 1 to 100. However, the loop should only cover the first 10 groups, i.e. the first 10 percentile thresholds.
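
To make the target concrete, here is a minimal sketch of the computation described above, run on the sample data. It assumes single variables named percent, rank and weight exactly as in the example, and that within the sort order (percent, rank) every observation of a lower group comes before those of a higher group. The local kmax is a hypothetical name introduced only for this illustration; it is 3 here because the sample has three groups, and would be 10 on the full data. The names ni_`k' and nall_`k' follow my attempt further down.

```stata
clear
input rank percent weight
1 1 2000
2 1 3000
3 2 7000
4 2 1200
5 3 3000
6 3 4500
end

* Put lower groups first so a running sum never picks up a later group
sort percent rank

* kmax is a hypothetical name: number of groups to process (10 on the full data)
local kmax = 3

forvalues k = 1/`kmax' {
    * running sum of weight over all observations in groups 1..k
    gen double ni_`k' = sum(weight) if percent <= `k'
    * total weight of groups 1..k, filled in only for those observations
    egen double nall_`k' = total(weight) if percent <= `k'
}

list percent weight ni_* nall_*, sepby(percent) noobs
```

Observations outside groups 1..k are left missing in ni_`k' and nall_`k', which corresponds to the blank cells in the desired result.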

Here is what I tried in Stata, but the results don't seem right:
Code:
local k = 1
foreach var of varlist percent_* {
    foreach x in percent_`k' {
        bysort percent_`k' (rank`k'): gen ni_`k' = sum(weight) if `x' == `x'[_n+1]
        bysort percent_`k': egen nall_`k' = total(weight) if `x' == `x'[_n+1]
    }
    local k = `k' + 1
}
end
To give an example of what I'm looking for, eventually it should look like this:
Code:
 
percent  weight   Ni_1  Nall_1   Ni_2  Nall_2   Ni_3  Nall_3
      1    2000   2000    5000   2000   13200   2000   20700
      1    3000   5000    5000   5000   13200   5000   20700
      2    7000      .       .  12000   13200  12000   20700
      2    1200      .       .  13200   13200  13200   20700
      3    3000      .       .      .       .  16200   20700
      3    4500      .       .      .       .  20700   20700
 end
Any help is much appreciated.

Thank you
Moritz

