Hi everyone,
I am trying to do two things with my data and am struggling to set up the calculation; please advise. First, I want to estimate the total proportion (not the mean proportion) by group (of schools), year, outcome, and treatment (=0 or =1). Then, from those proportions, I would like to calculate the average across all years.
This is my code:
generate outcome_total=.
replace outcome_total = sum(outcome) / sum(number_of_participants)
sum outcome_total if treatment==1 | year==2001
sum outcome_total if treatment==0 | year==2001
sum outcome_total if treatment==1 | year==2002
sum outcome_total if treatment==0 | year==2002
sum outcome_total if treatment==1 | year==2003
sum outcome_total if treatment==0 | year==2003
mean outcome_total if treatment==1
mean outcome_total if treatment==0
The issue I'm running into is whether I'm supposed to "sum" "outcome_total" per year and then take the "mean" to get the total average proportion. I want the aggregate of outcome by treatment and year, and then the mean across all the years. But without the "sum" per year, I get different numbers for the different schools in each year, because I don't know how to group the schools together. I would appreciate any help with grouping the schools by year, outcome, and treatment, getting the aggregate, and then taking the mean across years. Please let me know if that makes sense. Many thanks!
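For reference, here is a minimal sketch of one way the calculation could be set up. It assumes one row per school and year, with the variable names used above. Two points about the code above matter here: inside generate or replace, sum() is a running sum rather than a group total, and | means "or", so a condition like treatment==1 | year==2001 selects every treated row plus every 2001 row (the "and" operator is &). The sketch uses collapse to pool the counts within each treatment-year cell and then averages the yearly proportions with equal weight per year. If the schools should also be split by a grouping variable, it could be added to by(); no such variable is assumed here.

```stata
* Sketch only: assumes variables year, treatment, outcome and
* number_of_participants exist, with one row per school-year.
preserve

* Pool counts over schools within each treatment-year cell
collapse (sum) outcome number_of_participants, by(treatment year)

* Total (pooled) proportion = summed outcome / summed participants
generate double outcome_total = outcome / number_of_participants
list treatment year outcome_total, sepby(treatment)

* Unweighted mean of the yearly proportions, by treatment
collapse (mean) outcome_total, by(treatment)
list

restore
```

Because the data are wrapped in preserve and restore, the original school-level rows come back after the last list. Averaging the yearly proportions this way gives each year equal weight regardless of how many participants it had, which may or may not be the intended definition of the overall average.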
This is my dataset:
clear
input float(year school treatment outcome number_of_participants)
6 1 1 1
7 1 1 1
8 1 1 1
9 1 1 1
10 1 1 1
6 1 1 1
7 1 1 1
8 1 1 1
9 1 1 1
10 1 1 1
6 1 1 1
7 1 1 1
8 1 1 1
9 1 1 1
10 1 1 1
1 1 164 183
2 1 195 203
3 1 208 214
4 1 314 209
5 1 247 195
6 1 57 71
7 1 51 87
8 1 47 57
9 1 36 23