I want to graph a number of financial variables, such as total household assets, and compare their values across groups defined by a few dichotomous and categorical variables (e.g. race, religion, etc.). For each group I plot a 95% confidence band to show the range of values and overlay that with the average of the same variable. The financial variable is on the y axis and age is on the x axis, so I can observe how these values change over the lifecycle (by group).
As you can see, my dataset contains extreme values, and I am looking for options on how best to 'deal' with these for graphing purposes without excluding any of the observations, as I do not want to artificially affect the mean values.
[Graph: total assets by age for the two groups, wave 2]
Code:
tw (lpolyci totasset hgage1 if group == 1 & wave == 2, bwidth(3) lc("230 76 138") lw(medthick) ciplot(rarea) acolor("230 76 138%30") alw(5) level(95)) ///
   (lpolyci totasset hgage1 if group == 2 & wave == 2, bwidth(3) lc("25 154 222") lw(medthick) ciplot(rarea) acolor("25 154 222%30") alw(none) level(95)) ///
   (connect totassetave hgage1 if group == 1 & wave == 2, lc("230 76 138%70") lwidth(medthin) lpattern(shortdash) m(oh) mlw(vthin) mc("230 76 138%90")) ///
   (connect totassetave hgage1 if group == 2 & wave == 2, lc("25 154 222%10") lwidth(medthin) lpattern(shortdash) m(oh) mlw(vthin) mc("25 154 222%90")), ///
   title("Wave 2", size(medsmall) position(11) justification(right)) ///
   legend(region(lstyle(none)) order(2 "type A" 4 "type B") col(2) pos(0) ring(1) bplace(ne) rowgap(.1) colgap(1) size(small) color(none) region(fcolor(none))) ///
   angle(h) ytitle("Total assets", size(small)) xtitle("Age", size(small)) ///
   xla(20(10)100, format(%8.0fc) labsize(vsmall)) xtick(20(10)100) xmtick(15(10)95) ///
   yla(0(400000)2000000, format(%10.0fc) labsize(vsmall)) ytick(0(400000)2000000) ymtick(200000(400000)2000000, grid nogmin gex glc(gs12) glp(dot) glw(medthin)) ytick(0(.1).5) ///
   plotr(margin(zero) lw(medthin)) scheme(burd) name("Fig4", replace) scale(1.2)
One option is to take the natural log of these values. After applying this change to the first five lines of code, I obtain this graph, which still shows extreme values. (Note: there are no negative values.) I believe -yscale(log)- and/or -ylabel()- will help here, but I have not yet worked out how to code this so that the axis values are shown in $ terms. Any suggestions would be appreciated.
[Graph: the same plot using the natural log of total assets]
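A minimal sketch of the labelling idea, under stated assumptions rather than a tested solution: smooth the logged variable, then place the y-axis labels at log positions while showing their text in dollar terms. The variable name lntotasset, the local macro names, and the particular tick values are hypothetical and would need adjusting to the data. Note that ln(0) is missing in Stata, so any zero values drop out of the logged plot, and that a smooth of logged values corresponds to geometric rather than arithmetic means.

```stata
* Sketch only: lntotasset is a hypothetical name for the logged variable
generate double lntotasset = ln(totasset)   // missing where totasset == 0

* Build the label list: position on the log scale, text in dollars
* (tick values below are illustrative assumptions)
local ylabs
foreach v in 1000 10000 100000 1000000 10000000 {
    local pos = ln(`v')
    local txt = trim(string(`v', "%12.0fc"))
    local ylabs `ylabs' `pos' "`txt'"
}

* Same smooths as before, but on the logged variable
twoway (lpolyci lntotasset hgage1 if group == 1 & wave == 2, bwidth(3) level(95)) ///
       (lpolyci lntotasset hgage1 if group == 2 & wave == 2, bwidth(3) level(95)), ///
       ylabel(`ylabs', angle(horizontal) labsize(vsmall)) ///
       ytitle("Total assets (dollars, log scale)") xtitle("Age")
```

The styling options from the original command (colours, legend, scheme) could be added back unchanged; they are left out here only to keep the sketch short.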
I am using Stata 15.1 with panel data. This post has its roots at #11-#15 here https://www.statalist.org/forums/for...-loop-question, but the question had 'morphed' away from the original thread title, hence the repost.