Dear Statalisters,
I would be grateful for help with the issue below. I have tried to give as much information as possible and to proceed step by step, but please let me know if I should provide anything else (this is my first post!).
First, I am trying to generate a graph with attendance (in percent) at various training sessions on the y-axis and the training sessions themselves (all trainings, T1, T2, T3, T4) on the x-axis. I do this with the code below:
twoway connected avg_percentattendedt T if T==0 || rcap hi_percentattendedt lo_percentattendedt T if T==0 /// * all trainings
|| connected avg_percentattendedt T if T==1 || rcap hi_percentattendedt lo_percentattendedt T if T==1 /// * T1
|| connected avg_percentattendedt T if T==2 || rcap hi_percentattendedt lo_percentattendedt T if T==2 /// *T2
|| connected avg_percentattendedt T if T==3 || rcap hi_percentattendedt lo_percentattendedt T if T==3 /// *T3
|| connected avg_percentattendedt T if T==4 || rcap hi_percentattendedt lo_percentattendedt T if T==4 /// *T4
, legend(order( 1 "All T mean" 2 "All T hi/low" 3 "T1 mean" 4 "T1 hi/low" 5 "T2 mean" 6 "T2 hi/low" 7 "T3 mean" 8 "T3 hi/low" 9 "T4 mean" 10 "T4 hi/low") pos(6) rows(5)) xlab(0 "All" 1 "T1" 2 "T2" 3 "T3" 4 "T4") ///
ytitle("%", height(10)) ylabel(55(5)80) xtitle("Treatment")
However, the attendees of the training sessions can be of three different types (say mg_level 1, mg_level 2, mg_level 3). I would like to reproduce the graph above, except that for each point on the x-axis (i.e. each training) it should show the mean and variation for each of the three groups.
The data are initially in wide format, and the percentage attendance variables do not distinguish between groups. The code below creates the variables by managerial level, then collapses the data and reshapes them to long format, so that I end up with a dataset of three observations (one for each managerial level) and the variables "T avg_percentattendedt0 hi_percentattendedt0 lo_percentattendedt0 avg_percentattendedt1 hi_percentattendedt1 lo_percentattendedt1 avg_percentattendedt2 hi_percentattendedt2 lo_percentattendedt2 avg_percentattendedt3 hi_percentattendedt3 lo_percentattendedt3 avg_percentattendedt4 hi_percentattendedt4 lo_percentattendedt4". T equals 1, 2 and 3 for observations 1, 2 and 3 respectively and identifies the group.
global Var percentattendedt0 percentattendedt1 percentattendedt2 percentattendedt3 percentattendedt4
foreach y of varlist $Var {
    forval i = 1/3 {
        * levels 1 and 2 are restricted to key attendants; level 3 is not
        if `i' < 3 {
            local cond "keyattendant == 1 & mg_level == `i'"
        }
        else {
            local cond "mg_level == `i'"
        }
        su `y' if `cond'
        scalar mean_`y'`i' = r(mean)
        scalar n_`y'`i' = r(N)
        scalar sd_`y'`i' = r(sd)
        * group mean and 95% t-based confidence limits
        egen avg_`y'`i' = mean(`y') if `cond'
        gen hi_`y'`i' = avg_`y'`i' + invttail(n_`y'`i'-1,0.025)*(sd_`y'`i' / sqrt(n_`y'`i'))
        gen lo_`y'`i' = avg_`y'`i' - invttail(n_`y'`i'-1,0.025)*(sd_`y'`i' / sqrt(n_`y'`i'))
    }
}
collapse (mean) avg_percentattendedt01 hi_percentattendedt01 lo_percentattendedt01 ///
avg_percentattendedt02 hi_percentattendedt02 lo_percentattendedt02 ///
avg_percentattendedt03 hi_percentattendedt03 lo_percentattendedt03 ///
avg_percentattendedt11 hi_percentattendedt11 lo_percentattendedt11 ///
avg_percentattendedt12 hi_percentattendedt12 lo_percentattendedt12 ///
avg_percentattendedt13 hi_percentattendedt13 lo_percentattendedt13 ///
avg_percentattendedt21 hi_percentattendedt21 lo_percentattendedt21 ///
avg_percentattendedt22 hi_percentattendedt22 lo_percentattendedt22 ///
avg_percentattendedt23 hi_percentattendedt23 lo_percentattendedt23 ///
avg_percentattendedt31 hi_percentattendedt31 lo_percentattendedt31 ///
avg_percentattendedt32 hi_percentattendedt32 lo_percentattendedt32 ///
avg_percentattendedt33 hi_percentattendedt33 lo_percentattendedt33 ///
avg_percentattendedt41 hi_percentattendedt41 lo_percentattendedt41 ///
avg_percentattendedt42 hi_percentattendedt42 lo_percentattendedt42 ///
avg_percentattendedt43 hi_percentattendedt43 lo_percentattendedt43
gen A = 1
reshape long avg_percentattendedt0 avg_percentattendedt1 avg_percentattendedt2 avg_percentattendedt3 avg_percentattendedt4 ///
hi_percentattendedt0 hi_percentattendedt1 hi_percentattendedt2 hi_percentattendedt3 hi_percentattendedt4 ///
lo_percentattendedt0 lo_percentattendedt1 lo_percentattendedt2 lo_percentattendedt3 lo_percentattendedt4, i(A) j(T)
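The same per-level summaries could probably be produced more compactly with reshape and collapse. The following is only a minimal sketch on simulated data, not the real file; pid, training, avg, sd, n, hi and lo are names introduced here for illustration, and the sample restriction copies the conditions used in the loop above.

```stata
* Toy data standing in for the real file (all values are simulated)
clear
set seed 12345
set obs 300
gen mg_level = ceil(3 * runiform())
gen keyattendant = runiform() < 0.7
forvalues t = 0/4 {
    gen percentattendedt`t' = 55 + 25 * runiform()
}

* Levels 1 and 2 use key attendants only; level 3 uses everyone,
* mirroring the conditions in the loop above
keep if mg_level == 3 | (keyattendant == 1 & inlist(mg_level, 1, 2))

* One row per person and training, then one row per level and training
gen long pid = _n
reshape long percentattendedt, i(pid) j(training)
collapse (mean) avg = percentattendedt (sd) sd = percentattendedt ///
    (count) n = percentattendedt, by(mg_level training)

* 95% t-based confidence limits, same formula as in the loop above
gen hi = avg + invttail(n - 1, 0.025) * sd / sqrt(n)
gen lo = avg - invttail(n - 1, 0.025) * sd / sqrt(n)
list, sepby(mg_level)
```

Note that this layout differs from the one described above: it has one row per combination of managerial level and training (15 rows), with training running from 0 (all trainings) to 4.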
My best attempt so far is shown below. Unless I have misunderstood, twoway does not accept an over() option, which I think is the main reason I am stuck.
twoway connected avg_percentattendedt0 T || rcap hi_percentattendedt0 lo_percentattendedt0 T ///
|| connected avg_percentattendedt1 T || rcap hi_percentattendedt1 lo_percentattendedt1 T ///
|| connected avg_percentattendedt2 T || rcap hi_percentattendedt2 lo_percentattendedt2 T ///
|| connected avg_percentattendedt3 T || rcap hi_percentattendedt3 lo_percentattendedt3 T ///
|| connected avg_percentattendedt4 T || rcap hi_percentattendedt4 lo_percentattendedt4 T ///
, legend(order( 1 "All T mean" 2 "All T hi/low" 3 "T1 mean" 4 "T1 hi/low" 5 "T2 mean" 6 "T2 hi/low" 7 "T3 mean" 8 "T3 hi/low" 9 "T4 mean" 10 "T4 hi/low") pos(6) rows(5)) xlab(0 "All" 1 "T1" 2 "T2" 3 "T3" 4 "T4") ///
ytitle("%", height(10)) ylabel(55(5)80) xtitle("Treatment")
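One possible way to show three groups at each training without over() is to keep one row per level and training and shift each level slightly along the x-axis. The sketch below runs on made-up summary numbers, not real results; training, avg, lo, hi and x are hypothetical names, the 0.2 offset is arbitrary, and scatter is used instead of connected so that the "All" point is not joined to T1.

```stata
* Made-up summary data: one row per training (0 = all, 1-4 = T1-T4) and level
clear
set seed 2024
set obs 15
gen training = floor((_n - 1) / 3)
gen mg_level = mod(_n - 1, 3) + 1
gen avg = 60 + 15 * runiform()
gen lo = avg - 3
gen hi = avg + 3

* Shift each level a little so the three groups sit side by side
gen x = training + (mg_level - 2) * 0.2

twoway (rcap hi lo x if mg_level == 1) (scatter avg x if mg_level == 1) ///
       (rcap hi lo x if mg_level == 2) (scatter avg x if mg_level == 2) ///
       (rcap hi lo x if mg_level == 3) (scatter avg x if mg_level == 3), ///
       legend(order(2 "mg_level 1" 4 "mg_level 2" 6 "mg_level 3") pos(6) rows(1)) ///
       xlabel(0 "All" 1 "T1" 2 "T2" 3 "T3" 4 "T4") ///
       ytitle("%") xtitle("Training")
```

Each managerial level gets its own pair of plots, so the legend only needs the three scatter entries; the range caps inherit their meaning from the neighbouring markers.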