Thanks to Kit Baum, an update to the package gtools is now available for download from SSC. In Stata 13.1 or later, run

Code:
ssc install gtools, replace
See the original announcement here. In short, gtools implements faster versions of several Stata commands, including: collapse, reshape, xtile, tabstat, isid, egen, pctile, winsor, contract, levelsof, duplicates, and unique/distinct. For details on the package, see the official documentation. For details on the update, see the release notes. Some highlights:

New commands:
  • greshape long/wide, 4-20x faster than reshape long/wide (additionally accepts any number of i or j variables).
  • greshape gather/spread, similar to long/wide but made to mimic the gather and spread commands in R's tidyr package.
  • gstats tab, 5-40x faster than tabstat (additionally accepts any number of grouping variables).
  • gstats sum, 5-10x faster than sum, detail (plain summarize is not slow, but the -detail- option is, because it has to compute all the percentiles).
  • gstats winsor, 10-20x faster than winsor2.
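A minimal sketch of how the new commands might be called on a toy dataset. It assumes gtools is already installed and reuses only the option syntax that appears in the benchmark code in this post (i(), j(), by(), s()); the variable names id, g, x1, and x2 are made up for illustration. See the help files for the full set of options.

```stata
* Toy data: 6 observations, two stub variables (x1, x2), one grouping variable
clear
set obs 6
gen id = _n
gen g  = mod(_n, 2)
gen x1 = _n
gen x2 = 10 * _n

* Wide to long and back, with the same syntax as reshape
greshape long x, i(id) j(j)
greshape wide x, i(id) j(j)

* Grouped summary statistics, as with tabstat
gstats tab x1, by(g) s(n mean min max)

* Detailed summary, as with summarize, detail
gstats sum x1

* Winsorize x1 into a new variable with suffix _w, as with winsor2
gstats winsor x1, s(_w)
```

After the round trip the data are back to one row per id, and gstats winsor should leave a new variable x1_w next to x1.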
New features:
  • gcollapse, gegen, and gstats tab now allow the following statistics:
    • select# and select-#, to select the #th smallest or largest value
    • rawselect# and rawselect-#, the same as select# and select-# but ignoring weights
    • cv, to compute the coefficient of variation
    • variance
    • range
  • gtop and glevelsof can save their results in a mata object via mata(name).
  • gtop (gtoplevelsof) can list all the levels via ntop(.), similar to tablist (ntop(-.) lists them from least to most common); the option -alpha- lists the top levels in variable order instead of frequency order.
  • greshape allows varlist syntax for long to wide reshapes (though this cannot be combined with @ in the same stub); wide to long matches do not allow varlist syntax, but complex matches can be achieved via the option match(regex), which treats the stubs as regular expressions (details here).
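As an illustration of the new statistics, here is a hedged sketch using gcollapse on a toy dataset. The statistic names (select#, select-#, range, variance, cv) and ntop(.) are taken from the list above; the target variable names (second, largest, rng, v, x_cv) are made up, and the exact syntax is worth checking against help gcollapse and help gtop.

```stata
* Toy data: x = 1..10, split into two groups by parity
clear
set obs 10
gen g = mod(_n, 2)
gen x = _n

* List every level of g rather than only the most common ones
gtop g, ntop(.)

* New statistics, by group:
*   select2   2nd smallest value
*   select-1  largest value
*   range     max - min
gcollapse (select2)  second  = x ///
          (select-1) largest = x ///
          (range)    rng     = x ///
          (variance) v       = x ///
          (cv)       x_cv    = x, by(g)
list
```

For g == 0 the values of x are 2, 4, 6, 8, 10, so the second smallest should be 4, the largest 10, and the range 8.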
Some quick benchmarks for the new commands (run on Stata 15/MP for Unix, 8 cores):

Code:
clear all
ssc install winsor2

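* Usage: bench #: command
* Times command with timer # and stores the elapsed seconds in local r#
program bench
    * Split "#: command" into the timer number, the colon, and the command
    gettoken timer call: 0,    p(:)
    gettoken colon call: call, p(:)
    cap timer clear `timer'
    timer on `timer'
    `call'
    timer off `timer'
    qui timer list
    * c_local (undocumented) sets the local in the caller's scope
    c_local r`timer' `=r(t`timer')'
end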

set obs 10000000
gen groups = int(runiform() * 1000)
gen smallg = mod(groups, 10)
gen rsort  = rnormal()
gen rvar   = rnormal()
gen ix     = _n
sort rsort

preserve
    rename (rsort rvar) (r1 r2)
    bench 11: greshape long r, i(ix) j(j)
restore, preserve
    rename (rsort rvar) (r1 r2)
    greshape long r, i(ix) j(j) nochecks
    bench 16: greshape wide r, i(ix) j(j)
restore, preserve
    rename (rsort rvar) (r1 r2)
    bench 10: reshape long r, i(ix) j(j)
restore, preserve
    rename (rsort rvar) (r1 r2)
    greshape long r, i(ix) j(j) nochecks
    bench 15: reshape wide r, i(ix) j(j)
restore

bench 21: qui gstats winsor rvar, s(_wg)
bench 20: qui winsor2 rvar

bench 26: qui gstats sum rvar
bench 25: qui sum rvar, detail

bench 31: qui gstats tab rvar, by(smallg) s(n mean min max)
bench 30: qui tabstat rvar,    by(smallg) s(n mean min max)

local commands       ///
        reshape_long ///
        reshape_wide ///
        winsor       ///
        sum_detail   ///
        tabstat

local bench_table `"       Versus | Native | gtools | % faster "'
local bench_table `"`bench_table'"' _n(1) `" ------------ | ------ | ------ | -------- "'
forvalues i = 10(5)30 {
    gettoken cmd commands: commands
    local pct      "`:disp %7.2f  100 * (`r`i'' - `r`=`i'+1'') / `r`i'''"
    local dnative  "`:disp %6.2f `r`i'''"
    local dgtools  "`:disp %6.2f `r`=`i'+1'''"
    local cmd      `"`:disp %12s "`cmd'"'"'
    local bench_table `"`bench_table'"' _n(1) `" `cmd' | `dnative' | `dgtools' | `pct'% "'
}
disp _n(1) `"`bench_table'"'
Results

Code:
      Versus | Native | gtools | % faster 
------------ | ------ | ------ | -------- 
reshape_long | 111.63 |   8.21 |   92.65% 
reshape_wide | 127.61 |  16.52 |   87.05% 
      winsor |  28.87 |   1.17 |   95.96% 
  sum_detail |  30.50 |   1.63 |   94.65% 
     tabstat |  32.63 |   1.03 |   96.83%
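
The "% faster" column is computed as 100 * (native - gtools) / native. To compare against the "Nx faster" figures quoted in the highlights, the same timings can be turned into a speedup multiple, native / gtools. A quick check using the reshape_long row (the scalar names are just for illustration):

```stata
* Timings in seconds, copied from the reshape_long row of the table
scalar native = 111.63
scalar gtools = 8.21

* Percent faster, as reported in the table
scalar pct = 100 * (native - gtools) / native
display %7.2f pct

* Speedup multiple (about 13.6x for this row)
scalar speedup = native / gtools
display %5.1f speedup
```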