https://medium.com/the-stata-guide/c...s-dbe022e7264d
I figured I’d post this on Statalist so it gets more exposure. I’ve seen a thread here expressing interest in how to do this in Stata, so here it is.
This relies on the cleanplots scheme along with Ben Jann’s colrspace and palettes packages (all available from GitHub; the net install commands are included, commented out, in the code block below).
For anyone interested in data visualization in Stata, I recommend reading through Asjad’s posts here –
https://medium.com/the-stata-guide
Code:
clear all

import delimited using "https://covid.ourworldindata.org/data/ecdc/full_data.csv", clear

// net install cleanplots, from("https://tdmize.github.io/data/cleanplots")
// net install palettes, replace from("https://raw.githubusercontent.com/benjann/palettes/master/")
// net install colrspace, replace from("https://raw.githubusercontent.com/benjann/colrspace/master/")

set scheme cleanplots

gen date2 = date(date, "YMD")
format date2 %tdDD-Mon-yyyy

keep if location == "Austria" | ///
    location == "Belgium" | ///
    location == "Czech Republic" | ///
    location == "Denmark" | ///
    location == "Finland" | ///
    location == "France" | ///
    location == "Germany" | ///
    location == "Greece" | ///
    location == "Hungary" | ///
    location == "Italy" | ///
    location == "Ireland" | ///
    location == "Netherlands" | ///
    location == "Norway" | ///
    location == "Poland" | ///
    location == "Portugal" | ///
    location == "Slovenia" | ///
    location == "Slovak Republic" | ///
    location == "Spain" | ///
    location == "Sweden" | ///
    location == "Switzerland" | ///
    location == "United Kingdom"

ren location country
keep country date2 new_cases new_deaths
encode country, gen(country2)

*** fix some data errors
replace new_cases = 0 if new_cases < 0
replace new_deaths = 0 if new_deaths < 0

lab var new_cases "New cases"
lab var new_deaths "New deaths"
lab var date2 "Date"

drop if date2 < 21960
format date2 %tdDD-Mon

***** normalize the cases in a range of 0-1 for each country
gen cases_norm = .
gen deaths_norm = .

levelsof country2, local(levels)
foreach x of local levels {
    summ new_cases if country2==`x'
    replace cases_norm = new_cases / `r(max)' if country2==`x'
    summ new_deaths if country2==`x'
    replace deaths_norm = new_deaths / `r(max)' if country2==`x'
}

egen tag = tag(country)
summ date
gen xpoint = `r(min)' if tag==1
gen ypoint=.

levelsof country2, local(levels)
local items = `r(r)' + 6

foreach x of local levels {
    summ country2
    local newx = `r(max)' + 1 - `x' // reverse the sorting
    lowess cases_norm date if country2==`newx', bwid(0.05) gen(y`newx') nograph
    gen ybot`newx' = `newx'/ 2.8 // squish the axis
    gen ytop`newx' = y`newx' + ybot`newx'
    colorpalette matplotlib autumn, n(`items') nograph
    local mygraph `mygraph' rarea ytop`newx' ybot`newx' date , fc("`r(p`newx')'%75") lc(white) lw(thin) ||
    replace ypoint = (ytop`newx' + 1/8) if xpoint!=. & country2==`newx'
}

summ date
local x1 = `r(min)'
local x2 = `r(max)'

twoway `mygraph' ///
    (scatter ypoint xpoint, mcolor(white) msize(zero) msymbol(point) mlabel(country) mlabsize(*0.6) mlabcolor(black)), ///
    xlabel(`x1'(10)`x2', nogrid labsize(vsmall) angle(vertical)) ///
    ylabel(, nolabels noticks nogrid) yscale(noline) ytitle("") xtitle("") ///
    legend(off) ///
    title("{fontface Arial Bold:COVID-19 daily cases in Europe}") ///
    note("Data sources: Our World in Data, JHU, ECDC. World Bank classifications used for country groups. Each country plot is normalized by its maximum value.", size(tiny))

graph export joyplot.png, replace wid(2000)

exit
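For anyone who wants to see the core idea without the download or the extra packages, here is a minimal sketch on simulated data. It is not the code above, only an illustration of the same two steps: scale each series by its own maximum, then shift each series up by a group-specific baseline and draw it with rarea. The data and every variable name in it (group, t, y, ymax, y_norm, ybot, ytop) are made up for the example, and I have left out the lowess smoothing and the colorpalette colors to keep it short.

```stata
* Minimal ridgeline sketch on simulated data (hypothetical variable names)
clear
set seed 12345
set obs 300

gen group = ceil(_n / 100)          // three groups of 100 observations each
bysort group: gen t = _n            // time index within each group

* one bump per group, with different heights and peak positions
gen y = exp(-((t - 30*group)/12)^2) * group * 50 + runiform()

* scale each series so that its maximum is 1 within its group
bysort group: egen ymax = max(y)
gen y_norm = y / ymax

* shift each group's baseline so the ridges stack and overlap slightly
gen ybot = group / 2.8
gen ytop = y_norm + ybot

* rarea connects points in data order, so sort by time within group
sort group t

* draw the highest ridge first so the lower ridges sit in front of it
twoway (rarea ytop ybot t if group == 3, lcolor(white)) ///
       (rarea ytop ybot t if group == 2, lcolor(white)) ///
       (rarea ytop ybot t if group == 1, lcolor(white)), ///
       legend(off) ylabel(, nolabels noticks nogrid) ytitle("") xtitle("")
```

The divisor 2.8 is the same "squish" factor used in the full code. A smaller divisor spreads the ridges further apart, and a larger one makes them overlap more.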