Software Tools

Our faculty have been active both in developing statistical methodologies and in implementing software applications based on those methodologies. The following are some examples.

Bootstrap

The bootstrap method estimates the standard error of a statistic by repeatedly drawing "bootstrap samples" with replacement from the original data, re-evaluating the statistic on each bootstrap sample, and using the observed variability of the resulting values as the estimate. The bootstrap was introduced by Bradley Efron (1979, 1981, 1982) and further developed by Efron and Tibshirani (1993). The name reflects the fact that one available sample gives rise to many others by resampling (a concept reminiscent of pulling yourself up by your own bootstraps). Implementations now exist in many software packages, including a very flexible one in the boot package for R.
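The resampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the interface of the boot package; the function name bootstrap_se, its parameters, and the sample data are hypothetical choices made for this example.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Estimate the standard error of `statistic` by bootstrap resampling.

    Hypothetical helper for illustration; `n_boot` and `seed` are assumptions.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw a bootstrap sample: n observations, with replacement.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        # Re-evaluate the statistic on the bootstrap sample.
        replicates.append(statistic(sample))
    # The spread of the replicates estimates the standard error.
    return statistics.stdev(replicates)


# Made-up data, purely for illustration.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4]
se_mean = bootstrap_se(data, statistics.mean)
se_median = bootstrap_se(data, statistics.median)
print(se_mean, se_median)
```

For the sample mean the result can be compared with the textbook formula (sample standard deviation divided by the square root of n). The same function also works for statistics such as the median, where no simple formula is available.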

CART

CART stands for Classification and Regression Trees. As the name implies, the CART methodology uses binary trees, built by repeatedly splitting the data on one predictor at a time, to tackle classification and regression problems. The canonical reference for the methodology and software is the book Classification and Regression Trees by Breiman, Friedman, Olshen and Stone (Wadsworth, 1984).
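To make the idea of a binary regression tree concrete, here is a minimal sketch for a single numeric predictor. It is not the CART software: it only grows a tree by greedy squared-error splits and leaves out the pruning and multi-predictor handling described in the book. The names sse, best_split, grow and predict, and the toy data, are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, cost) of the best binary split on x, or None."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        cost = sse([v for _, v in pairs[:i]]) + sse([v for _, v in pairs[i:]])
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


def grow(x, y, max_depth=2):
    """Grow a binary tree as nested dicts; leaves store the mean response."""
    split = best_split(x, y) if max_depth > 0 else None
    if split is None:
        return {"predict": sum(y) / len(y)}
    threshold = split[0]
    left = [(a, b) for a, b in zip(x, y) if a <= threshold]
    right = [(a, b) for a, b in zip(x, y) if a > threshold]
    return {
        "threshold": threshold,
        "left": grow([a for a, _ in left], [b for _, b in left], max_depth - 1),
        "right": grow([a for a, _ in right], [b for _, b in right], max_depth - 1),
    }


def predict(tree, value):
    """Follow the splits from the root down to a leaf."""
    while "predict" not in tree:
        tree = tree["left"] if value <= tree["threshold"] else tree["right"]
    return tree["predict"]


# Toy data with two clearly separated groups.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, cost = best_split(x, y)
tree = grow(x, y, max_depth=1)
print(threshold, predict(tree, 2.5), predict(tree, 11))
```

A classification tree follows the same pattern with an impurity measure in place of the squared error and a majority class in place of the leaf mean.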

GAM

GAM stands for Generalized Additive Models. Developed by Hastie and Tibshirani, a GAM is a regression model in which the linear form in the predictors is replaced by a sum of smooth functions of the predictors. It is implemented in the gam package for R. For more details, see Hastie and Tibshirani's 1986 paper in Statistical Science.
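The replacement of the linear form by smooth functions can be written out as a pair of equations. The notation is assumed for this illustration and does not come from the text above: Y is the response, X_1, ..., X_p are the predictors, g is a link function, alpha is an intercept, the beta_j are linear coefficients, and the f_j are smooth functions estimated from the data.

```latex
\begin{align*}
% Generalized linear model: the predictors enter through a linear form.
g\bigl(\mathrm{E}[Y \mid X_1, \dots, X_p]\bigr) &= \alpha + \sum_{j=1}^{p} \beta_j X_j \\
% Generalized additive model: each term \beta_j X_j becomes a smooth f_j(X_j).
g\bigl(\mathrm{E}[Y \mid X_1, \dots, X_p]\bigr) &= \alpha + \sum_{j=1}^{p} f_j(X_j)
\end{align*}
```

With the identity link, the second line reduces to an ordinary additive regression model.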

Lasso

The lasso is a shrinkage and selection method for linear regression. It minimizes the usual sum of squared errors subject to a bound on the sum of the absolute values of the coefficients. It has connections to soft-thresholding of wavelet coefficients, forward stagewise regression, and boosting methods. It was described in a 1996 JRSS B paper by Robert Tibshirani. Visit the Lasso Project site.
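The link between the lasso and soft-thresholding can be seen in a small sketch. Note the assumptions: the code solves the equivalent penalized form (half the sum of squared errors plus lam times the sum of absolute coefficients) instead of the bounded form stated above, and it uses cyclic coordinate descent, which is a swapped-in technique for illustration and not the algorithm of the 1996 paper. The names soft_threshold, lasso_cd, lam and n_iter, and the data, are hypothetical, and no intercept is fitted.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimize 0.5 * sum((y - X beta)^2) + lam * sum(|beta_j|).

    Cyclic coordinate descent; assumes no column of X is all zeros.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # inner product of column j with the partial residual
            norm = 0.0  # squared length of column j
            for i in range(n):
                fit_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_others)
                norm += X[i][j] ** 2
            # One-dimensional lasso solution for coordinate j.
            beta[j] = soft_threshold(rho, lam) / norm
    return beta


# Made-up data: y depends strongly on column 0 and weakly on column 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]
beta_lasso = lasso_cd(X, y, lam=1.0)
beta_ols = lasso_cd(X, y, lam=0.0)  # lam = 0 recovers least squares
print(beta_lasso, beta_ols)
```

In this example the penalty shrinks the large coefficient and sets the small one exactly to zero, which is the combined shrinkage and selection behavior described above.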

WaveLab

WaveLab is a collection of Matlab functions that have been used by David Donoho and collaborators to implement a variety of computational algorithms related to wavelet analysis. It is made available in the interest of reproducible research, enabling others to understand and reproduce their work. Versions are provided for Macintosh, UNIX and Windows machines. Read about WaveLab.