Code:
ssc install gtools, replace
New commands:
- greshape long/wide, 4-20x faster than reshape long/wide (additionally accepts any number of i or j variables).
- greshape gather/spread, similar to long/wide but made to mimic the gather and spread commands in R's tidyr package.
- gstats tab, 5-40x faster than tabstat (additionally accepts any number of grouping variables).
- gstats sum, 5-10x faster than summarize, detail (plain summarize is not slow, but -detail- is, because it computes all the percentiles).
- gstats winsor, 10-20x faster than winsor2.
- gcollapse, gegen, and gstats tab now allow the following statistics:
  - select# and select-#, to select the #th smallest or largest value.
  - rawselect# and rawselect-#, the same as select# and select-# but ignoring weights.
  - cv, to compute the coefficient of variation.
  - variance
  - range
- gtop and glevelsof can save their results in a mata object via mata(name).
- gtop (gtoplevelsof) can list all the levels via ntop(.), similar to tablist. ntop(-.) lists them from least to most common, and the option -alpha- lists the top levels in variable order instead of frequency order.
- greshape allows varlist syntax for long to wide reshapes (though this cannot be combined with @ in the same stub). Wide to long matches do not allow varlist syntax, but complex matches can be achieved via the option match(regex), which treats the stubs as regular expressions (details here).
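As a minimal sketch of the new statistics listed above, the following collapses the auto dataset that ships with Stata by foreign. The statistic names come from the list; the target variable names are made up for illustration, and writing the select statistics as (select2) and (select-2) assumes they take the same parenthesized form as other gcollapse statistics.

```stata
* Assumes gtools is installed (ssc install gtools, replace)
sysuse auto, clear

* One row per level of foreign; target names (cv_price, ...) are illustrative
gcollapse (cv)       cv_price  = price ///
          (variance) var_price = price ///
          (range)    rng_price = price ///
          (select2)  lo2_price = price ///
          (select-2) hi2_price = price, by(foreign)

list
```

Here lo2_price should hold the second-smallest price and hi2_price the second-largest price within each group.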
Code:
clear all
ssc install winsor2

* bench #: command
* Times command with timer # and stores the elapsed seconds in the
* caller's local r#
program bench
    gettoken timer call: 0, p(:)
    gettoken colon call: call, p(:)
    cap timer clear `timer'
    timer on `timer'
    `call'
    timer off `timer'
    qui timer list
    c_local r`timer' `=r(t`timer')'
end

* Simulated data: 10 million observations
set obs 10000000
gen groups = int(runiform() * 1000)
gen smallg = mod(groups, 10)
gen rsort  = rnormal()
gen rvar   = rnormal()
gen ix     = _n
sort rsort

* Reshape timings (timers ending in 0 are native, ending in 1 are gtools)
preserve
    rename (rsort rvar) (r1 r2)
    bench 11: greshape long r, i(ix) j(j)
restore, preserve
    rename (rsort rvar) (r1 r2)
    greshape long r, i(ix) j(j) nochecks
    bench 16: greshape wide r, i(ix) j(j)
restore, preserve
    rename (rsort rvar) (r1 r2)
    bench 10: reshape long r, i(ix) j(j)
restore, preserve
    rename (rsort rvar) (r1 r2)
    greshape long r, i(ix) j(j) nochecks
    bench 15: reshape wide r, i(ix) j(j)
restore

* Winsorizing (note that winsor2 is timed on groups, gstats winsor on rvar)
bench 21: qui gstats winsor rvar, s(_wg)
bench 20: qui winsor2 groups

* Detailed summary statistics
bench 26: qui gstats sum rvar
bench 25: qui sum rvar, detail

* Grouped summary tables
bench 31: qui gstats tab rvar, by(smallg) s(n mean min max)
bench 30: qui tabstat rvar, by(smallg) s(n mean min max)

* Build and print the results table
local commands   ///
    reshape_long ///
    reshape_wide ///
    winsor       ///
    sum_detail   ///
    tabstat

local bench_table `" Versus | Native | gtools | % faster "'
local bench_table `"`bench_table'"' _n(1) `" ------------ | ------ | ------ | -------- "'
forvalues i = 10(5)30 {
    gettoken cmd commands: commands
    local pct     "`:disp %7.2f 100 * (`r`i'' - `r`=`i'+1'') / `r`i'''"
    local dnative "`:disp %6.2f `r`i'''"
    local dgtools "`:disp %6.2f `r`=`i'+1'''"
    local cmd     `"`:disp %12s "`cmd'"'"'
    local bench_table `"`bench_table'"' _n(1) `" `cmd' | `dnative' | `dgtools' | `pct'% "'
}
disp _n(1) `"`bench_table'"'
Code:
      Versus | Native | gtools | % faster
------------ | ------ | ------ | --------
reshape_long | 111.63 |   8.21 |   92.65%
reshape_wide | 127.61 |  16.52 |   87.05%
      winsor |  28.87 |   1.17 |   95.96%
  sum_detail |  30.50 |   1.63 |   94.65%
     tabstat |  32.63 |   1.03 |   96.83%
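The "% faster" column is the percent reduction in run time, 100 * (native - gtools) / native, whereas the command list quotes speedups as multiples. As a small sketch, the same timings can be turned into multiples by dividing the native time by the gtools time. The numbers are copied from the table above and nothing here calls gtools itself.

```stata
* Speedup multiple = native seconds / gtools seconds (timings from the table)
display "reshape_long: " %5.1f (111.63 / 8.21)
display "reshape_wide: " %5.1f (127.61 / 16.52)
display "winsor:       " %5.1f (28.87 / 1.17)
display "sum_detail:   " %5.1f (30.50 / 1.63)
display "tabstat:      " %5.1f (32.63 / 1.03)
```

On this run the multiples range from roughly 8x (reshape wide) to roughly 32x (tabstat).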