My Publications
Books
by Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani and Jonathan Taylor (July 2023). This book (ISLP) differs from the R book (ISLR2) in that the labs at the end of each chapter are implemented in Python.
Book Homepage and Resources
by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (August 2021). 3 new chapters (+179 pages), including Deep Learning.
Book Homepage and Resources
by Bradley Efron and Trevor Hastie (August 2016)
Book Homepage
by Trevor Hastie, Robert Tibshirani and Martin Wainwright (May 2015)
Book Homepage
by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009)
Book Homepage
by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (June 2013)
Book Homepage
by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2001)
Book Homepage
Papers
The research reported here was partially supported by grants from the National Science Foundation and the National Institutes of Health.

2025
-
F. Ferretti, J. Jenrette, S. Moro, C. Butner, E. Fox, S. H. D. Haddock, S. J. Jorgensen, T. Hastie, F. Micheli From Data Deficient to Big Data in Shark Conservation. Citizen science is increasingly harnessed worldwide to gather data otherwise requiring a prohibitive investment of funding and time. Crowdsourcing and social network mining can be used to quickly and cost-effectively fill major gaps in knowledge necessary to protect endangered populations. Sharks are among the most endangered and data-poor vertebrates in the ocean. Fish and Fisheries, 2025.
-
Emmanuel Candes, Trevor Hastie, Ked Hogan, Ronald Kahn, Robert Luo and Asher Spector. Thematic Investing: a Risk-based Perspective. We propose a risk-based definition of a thematic basket and focus on themes that involve significant transient correlations of residual returns. We present a bootstrapping-style approach to determine the statistical significance of the average pairwise correlation among stocks in a thematic basket. The thematic baskets with statistically significant average pairwise correlation will have risk levels above predictions. SSRN July 11, 2025.
-
Sourav Chatterjee, Trevor Hastie and Rob Tibshirani. Univariate Guided Sparse Regression. In this paper, we introduce "UniLasso" -- a novel statistical method for sparse regression. This two-stage approach preserves the signs of the univariate coefficients and leverages their magnitude. Both of these properties are attractive for stability and interpretation of the model. The link above points to the published paper at Harvard Data Science Review (2025), followed by ten contributed discussions, and our rejoinder.

2024
-
James Yang and Trevor Hastie. A Fast Coordinate Descent Method for High-Dimensional Non-Negative Least Squares using a Unified Sparse Regression Framework. We develop theoretical results that establish a connection across various regression methods such as non-negative least squares, bounded variable least squares, simplex constrained least squares, and lasso. In particular, we show in general that a polyhedron constrained least squares problem admits a locally unique sparse solution in high dimensions.
-
James Yang and Trevor Hastie. Note on the Equivalence of Orthogonalizing EM and Proximal Gradient Descent. Xiong et al. (Technometrics 2016) developed a method called orthogonalizing EM (OEM) to solve penalized regression problems for tall data. While OEM is developed in the context of the EM algorithm, we show that it is, in fact, an instance of proximal gradient descent, a popular first-order convex optimization algorithm. Technometrics (2024).
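For readers unfamiliar with proximal gradient descent, here is a minimal sketch for the lasso. It is not code from the note and does not implement OEM itself; the objective scaling, the step size 1/L and the function names are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_proximal_gradient(X, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient descent."""
    n, p = X.shape
    # Lipschitz constant of the gradient: largest eigenvalue of X^T X / n.
    L = np.linalg.norm(X, 2) ** 2 / n
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ b) / n                   # gradient of the smooth part
        b = soft_threshold(b - grad / L, lam / L)       # gradient step, then prox step
    return b

# Small synthetic example (hypothetical data).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, -2.0] + [0.0] * 8)
y = X @ beta_true + 0.1 * rng.standard_normal(n)
lam = 0.1
b_hat = lasso_proximal_gradient(X, y, lam)
```

Each iteration is a gradient step on the squared-error term followed by soft-thresholding, which is the structure the note identifies in OEM.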
-
Tetiana Parshakova, Trevor Hastie, Stephen Boyd. Fitting Multilevel Factor Models. We examine a special case of the multilevel factor model, with covariance given by a multilevel low rank (MLR) matrix (see Parshakova et al. in 2023 below). We develop a novel, fast implementation of the EM algorithm, tailored for multilevel factor models, to maximize the Gaussian likelihood of the observed data. This method accommodates any hierarchical structure and maintains linear time and storage complexities per iteration. This is achieved through a new efficient technique for computing the inverse of the positive definite MLR matrix. This paper is accompanied by an open-source Python package that implements the proposed methods. SIAM Journal on Matrix Analysis and Applications (August 2025).
-
Asher Spector, Rina Foygel Barber, Trevor Hastie, Ronald N. Kahn, Emmanuel Candès. The mosaic permutation test: an exact and nonparametric goodness-of-fit test for factor models. Financial firms rely on factor models to explain correlations among asset returns. After major events, e.g., COVID-19, analysts may reassess whether existing models continue to fit well: specifically, after accounting for the factor exposures, are the residuals of the asset returns independent? With this motivation, we introduce the mosaic permutation test, a nonparametric goodness-of-fit test for preexisting factor models.
-
James Yang and Trevor Hastie. A Fast and Scalable Pathwise-Solver for Group Lasso and Elastic Net Penalized Regression via Block-Coordinate Descent. We develop an efficient algorithm for solving the group lasso/elastic net penalized regression problem. Our package adelie (implemented in Python and also in R with a package on CRAN) appears to be faster by factors of 3 to 10 than the next fastest package on CRAN for wide data. We deliver solutions for all the GLM and other families in glmnet, and when all the groups are size 1, the package matches the performance of glmnet.

2023
-
Ley C, Heath F, Hastie T, Gao Z, Protsiv M, Parsonnet J. Defining usual oral temperature ranges in outpatients using an unsupervised learning algorithm. JAMA Intern Med. Published online September 5, 2023. We built a model for making temperature predictions using personal demographics. Check out the temperature tool.
-
Elena Tuzhilina, Trevor Hastie and Mark Segal. Statistical Curve Models For Inferring 3D Chromatin Architecture. We describe new methodology for reconstructing three dimensional (3D) chromatin structure from conformation-capture assays (such as Hi-C). We use smoothing-spline penalties to create a continuous family of smooth curves. In addition we fit these models using loss functions more appropriate to the sparsity of the data at hand. These include zero-inflated Poisson and negative binomial losses. Annals of Applied Statistics, 2024.
-
Tetiana Parshakova, Trevor Hastie, Eric Darve, Stephen Boyd. Factor Fitting, Rank Allocation, and Partitioning in Multilevel Low Rank Matrices. We consider multilevel low rank (MLR) matrices, defined as a row and column permutation of a sum of matrices, each one a block diagonal refinement of the previous one, with all blocks low rank given in factored form. MLR matrices extend low rank matrices but share many of their properties, such as the total storage required and complexity of matrix-vector multiplication. We address three problems that arise in fitting a given matrix by an MLR matrix in the Frobenius norm.
-
Disha Ghandwani, Swarnadip Ghosh, Trevor Hastie, Art Owen. Scalable solution to crossed random effects model with random slopes. We develop scalable algorithms for fitting crossed random effects models on e-commerce scale problems. Here we fit random slopes and intercepts at scale. EJS (2025).
-
Anav Sood and Trevor Hastie. A Statistical View of Column Subset Selection. We consider the problem of selecting a small subset of representative variables from a large dataset. In the computer science literature, this dimensionality reduction problem is typically formalized as Column Subset Selection (CSS). Meanwhile, the typical statistical formalization is to find an information-maximizing set of Principal Variables. This paper shows that these two approaches are equivalent, and moreover, both can be viewed as maximum likelihood estimation within a certain semi-parametric model. JRSSB (2025).

2022
-
Ismael Lemhadri, Harrison Li, Trevor Hastie. RbX: Region-based explanations of prediction models. We introduce region-based explanations (RbX), a novel, model-agnostic method to generate local explanations of scalar outputs from a black-box prediction model using only query access.
-
Elena Tuzhilina, Trevor Hastie, Daniel McDonald, J. Kenneth Tay, Robert Tibshirani. Smooth multi-period forecasting with application to prediction of COVID-19 cases. In this paper we consider the problem of multi-period forecasting that aims to predict several horizons at once. We propose a novel approach that forces the prediction to be "smooth" across horizons and apply it to two tasks: point estimation via regression and interval prediction via quantile regression. This methodology was developed for real-time distributed COVID-19 forecasting. We illustrate the proposed technique with the CovidCast dataset as well as a small simulation example. JCGS 2023 online.
-
Michael Greenacre, Patrick Groenen, Trevor Hastie, Alfonso Iodice d'Enza, Angelos Markos, and Elena Tuzhilina. Principal Component Analysis. A review article on PCA, illustrated using a variety of different applications. Nature Reviews Methods Primers 2023.
-
Samyak Rajanala, Stephen Bates, Trevor Hastie and Rob Tibshirani. Confidence Intervals for the Generalisation Error of Random Forests. We use the OOB information to estimate jackknife standard errors for the OOB estimate of error.

2021
-
Added a new blog entry on Altered Priors. You build a classifier on some training data, but you would like to deploy it in a population where the class distribution (prior) is different. This comes up in case-control sampling, but also in other situations such as transfer learning.
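As a rough illustration of the idea (a standard Bayes-rule reweighting, not necessarily the exact procedure in the blog entry), posterior probabilities from a classifier trained under one set of class priors can be adjusted to a new set of priors. The function name and the example numbers are hypothetical.

```python
import numpy as np

def adjust_posteriors(probs, train_priors, target_priors):
    """Reweight class posteriors from the training priors to the target priors.

    probs: (n_samples, n_classes) posteriors estimated under train_priors.
    Returns posteriors renormalized under target_priors.
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(target_priors, dtype=float) / np.asarray(train_priors, dtype=float)
    unnormalized = probs * weights                      # p(k|x) * pi'_k / pi_k
    return unnormalized / unnormalized.sum(axis=1, keepdims=True)

# A classifier trained on balanced (case-control) data, deployed where class 1 is rare.
probs = np.array([[0.5, 0.5],
                  [0.9, 0.1]])
adjusted = adjust_posteriors(probs, train_priors=[0.5, 0.5], target_priors=[0.99, 0.01])
```

For a logistic model this reweighting amounts to shifting the intercept by the change in log prior odds.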
-
Elena Tuzhilina and Trevor Hastie. Weighted Low Rank Matrix Approximation and Acceleration. We develop algorithms for computing an element-weighted low-rank matrix approximation (SVD) via projected gradient descent. We consider two acceleration schemes, Nesterov and Anderson, and discuss their implementation. We show how to scale these algorithms to high-dimensional problems.
-
Zijun Gao and Trevor Hastie. LinCDE: Conditional density estimation via Lindsey's method. Lindsey's method allows for smooth density estimation by turning the density problem into a Poisson GLM. In particular, we represent an exponential tilt function in a basis of natural splines, and use discretization to deal with the normalization. In this paper we extend the method to conditional density estimation via trees and then gradient boosting with trees. JMLR (2022). R package installable from GitHub: install_github("ZijunGao/LinCDE"); see LinCDE vignette for examples.
-
Yosuke Tanigawa, Junyang Qian, Guhan Venkataraman, Johanne Justesen, Ruilin Li, Robert Tibshirani, Trevor Hastie, Manuel Rivas. Significant Sparse Polygenic Risk Scores across 428 traits in UK Biobank. In this survey across the more than 1,600 traits in the UK Biobank, we report 428 strongly significant (p<2.5e-5) sparse polygenic risk models, computed by the snpnet lasso model developed by this team.
-
Swarnadip Ghosh, Trevor Hastie and Art Owen. Scalable logistic regression with crossed random effects. We develop an approach for fitting crossed random-effect logistic regression models at massive scales, with applications in e-commerce. We adapt a procedure of Schall (1991) and backfitting algorithms to achieve O(n) algorithms. EJS 2022 doi.org/10.1214/22-EJS2047
-
Zijun Gao and Trevor Hastie. DINA: Estimating Heterogeneous Treatment Effects in Exponential Family and Cox Models. We extend the R-learner framework to exponential families and the Cox model. Here we define the treatment effect to be the difference in natural parameter or DINA.
-
Stephen Bates, Trevor Hastie and Rob Tibshirani. Cross-validation: what does it estimate and how well does it do it? Although CV is ubiquitous in data science, some of its properties are poorly understood. In this paper we argue that CV is better at estimating expected prediction error than the prediction error for the particular model fit to the training set. We also provide a method for computing the standard error of the CV estimate, which is bigger than the commonly used naive estimate that ignores the correlations in the folds. JASA, 2023, DOI
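For reference, the naive standard error mentioned above can be sketched as follows. This shows only the naive estimate that treats the per-observation errors as independent; it does not implement the corrected method from the paper. The least-squares model, the fold count and the simulated data are assumptions for illustration.

```python
import numpy as np

def kfold_cv_naive_se(X, y, n_folds=10, seed=0):
    """K-fold CV estimate of squared prediction error for least squares,
    with the naive standard error sd(errors) / sqrt(n)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    errors = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        coef, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
        errors[test_idx] = (y[test_idx] - X[test_idx] @ coef) ** 2
    cv_estimate = errors.mean()
    # Naive: treats the n held-out errors as independent, ignoring
    # the dependence induced by overlapping training sets.
    naive_se = errors.std(ddof=1) / np.sqrt(n)
    return cv_estimate, naive_se

# Hypothetical simulated data with unit noise variance.
rng = np.random.default_rng(1)
n, p = 200, 5
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
y = X @ np.arange(1.0, p + 1) + rng.standard_normal(n)
cv_estimate, naive_se = kfold_cv_naive_se(X, y)
```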
-
Elena Tuzhilina, Leonardo Tozzi and Trevor Hastie. Canonical Correlation Analysis in high dimensions with structured regularization. We develop structurally regularized versions of CCA for very high-dimensional MRI images from neuroscience experiments. Statistical Modelling, 2023, Vol 23.
-
J. Kenneth Tay, Balasubramanian Narasimhan and Trevor Hastie. Elastic Net Regularization Paths for All Generalized Linear Models. This paper describes some of the substantial enhancements to the glmnet R package ver 4.1+. All programmed GLM families are accommodated through a family() argument. We also discuss relaxed fits, and facilities for modeling stop/start data and strata in survival models. To appear, Journal of Statistical Software.

2020
-
Benjamin Haibe-Kains et al. Transparency and reproducibility in artificial intelligence. A note in Matters arising, Nature, where we are critical of the lack of transparency in a high-profile paper on using AI for breast-cancer screening.
-
Lukasz Kidzinski, Francis Hui, David Warton and Trevor Hastie. Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays. We fit large-scale generalized linear latent variable models (GLLVM) by using efficient methodology adapted from matrix factorization. Designed with large-scale ecology studies in mind, with tens of thousands of species and/or locations, these methods provide good approximations to the state-of-the-art variational Bayes methods that do not scale gracefully to such sizes. JMLR 23(291) 2022.
-
Swarnadip Ghosh, Trevor Hastie and Art Owen. Backfitting for large scale crossed random effects regressions. Regression models with crossed random-effects can be very expensive to compute in large scale applications (e.g. e-commerce with millions of customers and many thousands of products). We adapt a backfitting algorithm to problems of this kind and achieve convergence (provably) linear in the number of observations. Annals of Statistics 2022.
-
Elena Tuzhilina, Trevor Hastie and Mark Segal. Principal curve approaches for inferring 3D chromatin architecture We adapt metric scaling algorithms to approximate the 3D conformation of a chromosome based on contact maps. Biostatistics, Nov 2020.
-
Trevor Hastie. Ridge Regularization: an Essential Concept in Data Science. This paper is written by request to celebrate the 50th anniversary of the first ridge-regression paper by Hoerl and Kennard (1970), Technometrics. In it I gather together some of my favorite aspects of ridge, and also touch on the "double-descent" phenomenon. Technometrics online link. The R markdown source for some of the figures is figureSource.rmd
-
Kenneth Tay, Nima Aghaeepour, Trevor Hastie, Robert Tibshirani. Feature-weighted elastic net: using "features of features" for better prediction We often have metadata that informs us about features to be used for supervised learning. For example, genomic features may belong to different pathways. Here we develop a method for exploiting this information to improve prediction. Statistica Sinica 2021.
-
Junyang Qian, Yosuke Tanigawa, Ruilin Li, Robert Tibshirani, Manuel A Rivas, Trevor Hastie. Large-Scale Multivariate Sparse Regression with Applications to UK Biobank We develop methodology to build large sparse polygenic risk score models using multiple phenotypes (multitask learning). Our methodology builds on our early snpnet lasso models for single phenotypes, and includes missing phenotype imputation and adjustment for confounders.
Here is a link to the code and scripts used in the paper https://github.com/junyangq/scripts_multiSnpnet_paper (Annals of Applied Statistics, vol 16(3), 2022.)

2019
-
Trevor Hastie, Andrea Montanari, Saharon Rosset and Ryan Tibshirani. Surprises in High-Dimensional Ridgeless Least Squares Interpolation. Interpolating fitting algorithms have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum L2-norm ("ridgeless") interpolation in high-dimensional least squares regression. We consider both a linear model and a version of a neural network. We recover several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk, and the potential benefits of overparametrization. Annals of Statistics, 2022 50(2) pp 949-986.
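To make "ridgeless" concrete, here is a small sketch (not from the paper) showing that, when p > n, the minimum L2-norm interpolator computed with the pseudoinverse coincides with the limit of ridge regression as the penalty goes to zero. The dimensions and the random data are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50                        # more features than observations
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Minimum L2-norm solution of X b = y.
beta_min_norm = np.linalg.pinv(X) @ y

def ridge(X, y, lam):
    """Ridge solution argmin ||y - X b||^2 + lam * ||b||^2, computed in the
    dual form X^T (X X^T + lam I)^{-1} y, which is convenient when p > n."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

beta_ridge = ridge(X, y, lam=1e-8)

# Any other interpolator differs by a null-space vector and has a larger norm.
v = rng.standard_normal(p)
null_part = v - np.linalg.pinv(X) @ (X @ v)
beta_other = beta_min_norm + null_part
```

Here `beta_min_norm` and `beta_other` both fit the training data exactly, but only the former is the ridgeless limit.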
-
Didier Nibbering and Trevor Hastie Multiclass-penalized logistic regression We develop a model for clustering classes in multi-class logistic regression. Computational Statistics and Data Analysis, 2022
-
Zijun Gao, Trevor Hastie and Rob Tibshirani. Assessment of heterogeneous treatment effect estimation accuracy via matching We address the difficult problem of assessing the performance of an HTE estimator. Our approach has several novelties: a flexible matching metric based on random-forest proximity scores, an optimized matching algorithm, and a match then split cross-validation scheme. Statistics in Medicine, April 2021

2018
-
Junyang Qian, Yosuke Tanigawa, Wenfei Du, Matthew Aguirre, Chris Chang, Robert Tibshirani, Manuel A. Rivas, Trevor Hastie. A Fast and Scalable Framework for Large-scale and Ultrahigh-dimensional Sparse Regression with Application to the UK Biobank. PLOS Genetics October 2020. We develop a scalable lasso algorithm for fitting polygenic risk scores at GWAS scale. There is also a bioRxiv version. Our R package snpnet combines efficient batch-wise strong-rule screening with glmnet to fit lasso regularization paths on phenotypes in the UK Biobank data.
Here is a link to the code and scripts used in the paper https://github.com/junyangq/scripts_snpnet_paper
-
Lukasz Kidzinski and Trevor Hastie Longitudinal data analysis using matrix completion We use a regularized form of matrix completion to fit functional principal component models, and extend these to other multivariate longitudinal regression models. We have an R package fcomplete which includes three vignettes demonstrating how it can be used. Published in JCGS 2023.

2017
-
Trevor Hastie, Rob Tibshirani and Ryan Tibshirani Extended Comparisons of Best Subset Selection, Forward Stepwise Selection, and the Lasso This paper is a follow-up to "Best Subset Selection from a Modern Optimization Lens" by Bertsimas, King, and Mazumder (AoS, 2016). We compare these methods using a broad set of simulations that cover typical statistical applications. Our conclusions are that best-subset selection is mainly needed in very high signal-to-noise regimes, and the relaxed lasso is the overall winner. Published as Best Subset, Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons. Statistical Science (2020) 35(4) 579-592. The supplement shows a comprehensive set of comparisons.
-
Scott Powers, Junyang Qian, Kenneth Jung, Alejandro Schuler, Nigam Shah, Trevor Hastie and Robert Tibshirani. Some methods for heterogeneous treatment effect estimation in high-dimensions We develop some new methods for estimating personalized treatment effects from observational data: causal boosting and causal MARS. Statistics in Medicine. (January 2018).

2016
-
Scott Powers, Trevor Hastie and Rob Tibshirani. Nuclear penalized multinomial regression with an application to predicting at-bat outcomes in baseball. Here we use a convex formulation of the reduced-rank multinomial model, in a novel application using a large dataset of baseball statistics. In special edition "Statistical Modelling for Sports Analytics", Statistical Modelling, vol. 18, 5-6: pp. 388-410.
-
Qingyuan Zhao and Trevor Hastie Causal Interpretations of Black-Box Models. We draw connections between Friedman's partial dependence plot and Pearl's back-door adjustment to explore the possibility of extracting causality statements after fitting complex models by machine learning algorithms. Finally published 2019 ASA/JBES, 39(1) 2019
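A partial dependence function can be sketched in a few lines: fix one feature at a grid value for every observation and average the model's predictions. This is a generic illustration, not code from the paper; the `predict` function and the data below are hypothetical stand-ins for a fitted black-box model.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction with column `feature` set to each grid value in turn."""
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v                 # intervene on one feature
        values.append(predict(X_mod).mean())  # average over the other features
    return np.array(values)

# Hypothetical "black box": f(x) = 2 * x0 + x1^2.
def predict(X):
    return 2.0 * X[:, 0] + X[:, 1] ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))
grid = np.linspace(-2.0, 2.0, 5)
pd_values = partial_dependence(predict, X, feature=0, grid=grid)
```

Averaging over the empirical distribution of the remaining features has the same form as the back-door adjustment formula, which is the connection the paper explores.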
-
Nicholas Boyd, Trevor Hastie, Stephen Boyd, Benjamin Recht, Michael Jordan. Saturating Splines and Feature Selection. We use a convex framework based on TV (total variation) penalty norms for nonparametric regression. Saturation for degree-two splines requires that the solution extrapolate as a constant beyond the range of the data. This, along with an additive model formulation, leads to a convex path algorithm for variable selection and smoothing with generalized additive models. JMLR 18(197):1-32, 2018

2015
-
Ian Renner, Jane Elith, Adrian Baddeley, William Fithian, Trevor Hastie, Steven Phillips, Gordana Popovic and David Warton. Point process models for presence-only analysis. An informative review of relevant methods for analyzing presence-only data in Ecology, which ties together a number of "dueling" methods that have been used for years without clear justification. The IPP or Inhomogeneous Poisson Process is the catalyst. Methods in Ecology and Evolution 2015, 6, 366–379. See also the Supplementary Information.
-
Scott Powers, Trevor Hastie and Robert Tibshirani Customized training with an application to mass spectrometric imaging of cancer tissue. Annals of Applied Statistics 9(4) (2015), 1709-1725.
-
Rakesh Achanta and Trevor Hastie. Telugu OCR Framework using Deep Learning. We build an end-to-end OCR system for Telugu script that segments the text image, classifies the characters and extracts lines using a language model. The classification module, which is the most challenging task of the three, is a deep convolutional neural network.
-
Jingshu Wang, Qingyuan Zhao, Trevor Hastie and Art Owen. Confounder Adjustment in Multiple Hypotheses Testing. (accepted, Annals of Statistics, 2016). We present a unified framework for analysing different proposals for adjusting for confounders in multiple testing (e.g. in genomics). We also provide an R package cate on CRAN that implements these different approaches. The vignette shows some examples of how to use it.
-
Alexandra Chouldechova and Trevor Hastie Generalized Additive Model Selection A method for selecting terms in an additive model, with sticky selection between null, linear and nonlinear terms, as well as the amount of nonlinearity. The R package gamsel has been uploaded to CRAN.

2014
-
Trevor Hastie, Rahul Mazumder, Jason Lee and Reza Zadeh. Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares We develop a new method for matrix completion, that improves upon our earlier softImpute algorithm, as well as the popular ALS algorithm. JMLR 2015 16 3367-3402. We have also incorporated this method in our R package softImpute.
-
Ya Le, Trevor Hastie. Sparse Quadratic Discriminant Analysis and Community Bayes We provide a framework for generalizing naive Bayes classification using sparse graphical models. Naive Bayes assumes conditional on the response, the variables are independent. We relax that, and assume their conditional dependence graph is shared and sparse.
-
William Fithian, Jane Elith, Trevor Hastie, David A. Keith. Bias Correction in Species Distribution Models: Pooling Survey and Collection Data for Multiple Species. We develop methods for teasing out observer-bias in multi-species presence-only data. (submitted, arXiv:1403.7274) (Methods in Ecology and Evolution, October 10, 2014). Github link to the R package multispeciesPP
-
Will Fithian and Trevor Hastie. Local Case-Control Sampling: Efficient Subsampling in Imbalanced Data Sets (arXiv). Modern classification tasks are often extremely unbalanced - a situation where case-control sampling can be very helpful. This paper discusses the bias of CC sampling, and proposes a simple two-stage subsampling procedure for removing this bias, and in cases dramatically improving the efficiency. Annals of Statistics 2014, Vol. 42, No. 5, 1693-1724. projecteuclid.org/euclid.aos/1410440622

2013
-
Lucas Janson, Will Fithian, and Trevor Hastie. Effective Degrees of Freedom: a Flawed Metaphor. The popular covariance formula for df gives some surprising results, like df>p in forward stepwise. May 2015, Biometrika
-
Stefan Wager, Trevor Hastie and Bradley Efron. Confidence Intervals for Random Forests: the Jackknife and the Infinitesimal Jackknife. We use ideas related to OOB errors to compute standard errors for bagging and random forests. Two approaches are presented, one based on the jackknife, the other on the infinitesimal jackknife. We study the bias of these estimates, as well as Monte Carlo errors. (JMLR 2014, 15 1625-1651)
-
David Warton, Bill Shipley and Trevor Hastie. CATS regression - a model-based approach to studying trait-based community assembly. We show how to use GLMs to fit community models, which are traditionally fit by maximum entropy. Apart from being a convenient platform for model fitting, all the usual summaries, statistics and extensions of GLMs are available. (Methods in Ecology and Evolution, September 29, 2014)
-
Hristo Paskov, Robert West, John Mitchell and Trevor Hastie. Compressive Feature Learning. We use an unsupervised convex document compression algorithm to derive a sparse k-gram representation for a corpus of documents. This same dictionary, in the spirit of "deep learning", is as good as the original k-gram representation for document classification tasks. To appear, NIPS 2013.
-
My first blog post with Will Fithian. This post refers to concerns that were raised about cross-validation.
-
Noah Simon, Jerome Friedman and Trevor Hastie. A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression. We use the group lasso in the context of multinomial and multi-response regression. Each variable has multiple coefficients for the different responses, and they each get selected via a group lasso penalty. Our code is an efficient implementation of block coordinate descent, and is built into the glmnet package. (submitted, on arXiv).
-
Michael Lim and Trevor Hastie. Learning interactions via hierarchical group-lasso regularization. We use the overlap group lasso in the context of a linear model to test for interactions. Our methodology can handle qualitative as well as quantitative variables. Our R package glinternet can fit linear and logistic regression models. Optimized code can handle thousands of variables (our largest example had > 20K 3-level factors). arXiv:1308.2719 (JCGS 2014, online access)
-
Michael Jordan et al. Frontiers in Massive Data Analysis. This 129 page document is the report produced by the Committee on the Analysis of Massive Data. This committee was established by the National Research Council of the National Academies, and met four times over 2011-2012 in Washington and California. Michael Jordan was the chair of the 18 member committee, made up of statisticians (5), computer scientists and mathematicians. I was a member of the committee, and was jointly responsible for Chapter 7 with David Madigan, although all committee members provided input to all chapters as well.
-
Trevor Hastie and Will Fithian. Inference from Presence-only Data: the Ongoing Controversy. This short paper argues strongly against the use of rigid parametric logistic regression models to make inferences from presence-only data in Ecology. Essentially the rigidity of the model manufactures information that is not present in the data. Ecography (2013, editor's choice). Video interview with David Warton at Ecostats conference at UNSW in Sydney in July 2013. David was wearing his editor's hat for Methods in Ecology and Evolution, and the discussion centered on this paper.
-
Noah Simon, Jerome Friedman, Trevor Hastie and Rob Tibshirani. The Sparse Group Lasso By mixing L1 penalties with group-lasso L2 penalties, we achieve a sparse group lasso where some members of a group can end up being zero. JCGS, May 2013, 22(2), pages 231-245.
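To illustrate how mixing the two penalties produces zeros within a group, here is a sketch of the proximal operator for a single group under the penalty lam1 * ||b||_1 + lam2 * ||b||_2. The parameter names `lam1` and `lam2` are hypothetical and do not follow the paper's parameterization, and this is not the paper's fitting algorithm.

```python
import numpy as np

def sparse_group_prox(z, lam1, lam2):
    """Prox of lam1 * ||b||_1 + lam2 * ||b||_2 for one group of coefficients:
    elementwise soft-thresholding followed by group-wise shrinkage."""
    u = np.sign(z) * np.maximum(np.abs(z) - lam1, 0.0)   # L1 part: zeros individual entries
    norm_u = np.linalg.norm(u)
    if norm_u <= lam2:
        return np.zeros_like(u)                          # L2 part: zeros the whole group
    return (1.0 - lam2 / norm_u) * u

# A group that survives, with one member set exactly to zero.
kept = sparse_group_prox(np.array([3.0, 0.5, -2.0]), lam1=1.0, lam2=1.0)

# A weak group that is removed entirely.
dropped = sparse_group_prox(np.array([0.8, -1.2, 0.5]), lam1=1.0, lam2=1.0)
```

In the first call the middle coefficient is zeroed by the L1 term while the group as a whole stays in the model; in the second the group norm falls below `lam2` and every member is zero.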
-
Jianqiang Wang and Trevor Hastie. Boosted Varying-Coefficient Regression Models for Product Demand Prediction. We use the varying coefficient paradigm to fit a market segmented product demand model, with boosted regression trees as the nonparametric component. JCGS
-
Julia Viladomat, Rahul Mazumder, Alex McInturff, Douglas McCauley and Trevor Hastie. Assessing the significance of global and local correlations under spatial autocorrelation; a nonparametric approach. Variables collected over a spatial domain often exhibit strong spatial autocorrelation. When such variables are used in a regression, pairwise correlation analysis, or in the popular geographically weighted regression, it can be difficult to assess significance. We propose a general approach based on randomization followed by smoothing to restore the spatial correlation structure. R code used in the paper. (Biometrics, Jan 2014)

2012
-
Will Fithian and Trevor Hastie. Finite-sample equivalence in statistical models for presence-only data. We show that a lot of different approaches to presence-only data are the same, in particular inhomogeneous Poisson processes, maxent, and naive logistic regression (when weighted appropriately). (AoAS 2013 7 (4) 1917-1939).
-
Jason Lee and Trevor Hastie. Learning Mixed Graphical Models. We use group-lasso regularized pseudo likelihood for learning the structure of a graphical model with mixed discrete and continuous variables. Our model respects the symmetry imposed by a Markov random field representation --- each of the potentials gets a vote from a pair of regression models (Gaussian, logistic or multinomial), where each of the pair of variables is the response and predictor. (on arXiv), JCGS 24(1) 2015. Go to Jason Lee's webpage for MATLAB code and a demo.
-
Rahul Mazumder, Jerome Friedman and Trevor Hastie. Sparsenet R package on CRAN. Fits sparse solution paths for linear models (square-error loss) using coordinate descent with MC+ penalty family. Software is very fast, and can handle many thousands of variables. Functions for cross-validation, prediction, plotting etc. Based on algorithms described in SparseNet : Coordinate Descent with Non-Convex Penalties. JASA 2011, 106(495) 1125-1138.

2011
-
Rahul Mazumder and Trevor Hastie Exact Covariance Thresholding into Connected Components for large-scale Graphical Lasso A block screening rule for graphical models that can dramatically speed up computations, and allow for distributed computing of models on a much larger scale. Original copy arXiv 8/18/2011 published JMLR March 2012 13, 723-736.
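The screening idea can be sketched as follows: threshold the absolute off-diagonal entries of the sample covariance matrix at the regularization level and split the variables into the connected components of the resulting graph, so that each block can be solved separately. This is an illustrative sketch, not the paper's code; the function name and the toy matrix are hypothetical.

```python
import numpy as np

def threshold_components(S, lam):
    """Label the connected components of the graph with an edge (i, j)
    whenever |S[i, j]| > lam for i != j."""
    p = S.shape[0]
    adj = np.abs(S) > lam
    np.fill_diagonal(adj, False)
    labels = -np.ones(p, dtype=int)
    current = 0
    for start in range(p):
        if labels[start] >= 0:
            continue
        # Depth-first search from an unlabeled variable.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i]):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy covariance matrix with two weakly coupled blocks.
S = np.array([
    [1.00, 0.60, 0.05, 0.00],
    [0.60, 1.00, 0.02, 0.01],
    [0.05, 0.02, 1.00, 0.40],
    [0.00, 0.01, 0.40, 1.00],
])
labels = threshold_components(S, lam=0.1)
labels_strong = threshold_components(S, lam=0.5)
```

Larger values of `lam` break the problem into more, smaller blocks, which is where the computational savings and the opportunity for distributed computing come from.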
-
Rahul Mazumder and Trevor Hastie. The Graphical Lasso: New Insights and Alternatives (arXiv 11/23/2011, published November 2012). We examine the glasso algorithm for solving the graphical lasso problem. We show that it solves the dual problem, where the optimization variable is the covariance rather than the precision matrix. We propose a similar primal algorithm, which appears to be superior in speed and other properties. Electronic Journal of Statistics 2012, 6, 2125-2149. R package dpglasso
-
Gen Nowak; Trevor Hastie; Jonathan R. Pollack; Robert Tibshirani A fused lasso latent feature model for analyzing multi-sample aCGH data. Biostatistics (advanced access June 2011). A new approach for modeling multisample CGH data, that exploits the similarities in copy-number variation at the same locations in the genome. Here is the online link at the Biostatistics journal website. The R package FLLat is available from CRAN.
-
Noah Simon, Jerome Friedman, Trevor Hastie and Rob Tibshirani. Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent. We develop the tools needed to include the Cox proportional hazards model in GLMNET. Journal of Statistical Software 39(5), 1-13 (2011).

2010
-
Rob Tibshirani, Jacob Bien, Jerome Friedman, Trevor Hastie, Noah Simon, Jonathan Taylor and Ryan Tibshirani: Strong rules for discarding predictors in lasso-type problems. We develop rules for screening predictors for lasso and elastic-net penalized models. When p>>n, this can result in large computational savings, without any loss in accuracy. JRSS B (2012) 74
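As an illustration of the kind of screening involved, here is a sketch of the basic (global) strong rule for the lasso objective (1/2) * ||y - X b||^2 + lam * ||b||_1: predictor j is discarded when |x_j^T y| < 2 * lam - lam_max, where lam_max is the largest absolute inner product. The function name and the toy data are hypothetical. The rule is a heuristic screen, so the optimality (KKT) conditions should still be checked afterwards for the discarded predictors.

```python
import numpy as np

def basic_strong_rule(X, y, lam):
    """Return a boolean mask of predictors kept by the basic strong rule."""
    scores = np.abs(X.T @ y)
    lam_max = scores.max()           # smallest lam at which all coefficients are zero
    return scores >= 2.0 * lam - lam_max

# Toy orthonormal design, where the lasso solution is plain soft-thresholding.
X = np.eye(3)
y = np.array([4.0, 2.0, 0.5])
lam = 2.5
keep = basic_strong_rule(X, y, lam)

# Exact lasso solution for this orthonormal design, for comparison.
beta = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
```

In this example the third predictor is screened out before fitting, and every predictor with a nonzero coefficient in `beta` is among those kept.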
-
Jane Elith, Steven Phillips, Trevor Hastie, Miroslav Dudik, Yung En Chee and Colin Yates. A statistical explanation of Maxent for Ecologists. Maxent is a method for modeling species prevalence with presence-background data. This paper explains Maxent using the language of statistical models. Diversity and Distributions, November 2010.
-
Jerome Friedman, Trevor Hastie and Rob Tibshirani: Applications of the lasso and grouped lasso to the estimation of sparse graphical models. We develop efficient algorithms for fitting sparse, undirected graphical models. These are specially designed for the "large p" situation.
-
Jerome Friedman, Trevor Hastie and Robert Tibshirani: A Note on the Group Lasso and a Sparse Group Lasso. We develop a group lasso with both sparsity of groups and sparsity within groups. We also develop coordinate-wise algorithms for fitting both cases.

2009
-
Rahul Mazumder, Jerome Friedman and Trevor Hastie: SparseNet : Coordinate Descent with Non-Convex Penalties. JASA 2011, 106(495) 1125-1138. Non-convex penalties produce sparser models than the LASSO, but pose difficulties for optimization. We propose a structured algorithm using coordinate descent which finds good solutions with guaranteed convergence properties. Appendix for SparseNet paper with extra figures and some additional technical proofs.
sparsenet R package available from CRAN (Feb 2012)
-
Rahul Mazumder, Trevor Hastie and Rob Tibshirani: Spectral Regularization Algorithms for Learning Large Incomplete Matrices. We develop an iterative algorithm for matrix completion using nuclear-norm regularization. JMLR 2010 11 2287-2322
MATLAB package SoftImpute for matrix completion (zip archive).
R package to appear soon.
-
Daniela Witten, Rob Tibshirani and Trevor Hastie: A penalized matrix decomposition, with applications to sparse canonical correlation analysis and principal components. Biostatistics 10(3)
-
Trevor Hastie, Robert Tibshirani and Jerome Friedman, Elements of Statistical Learning: Data Mining, Inference and Prediction (Second Edition). February, 2009. 745 pages in full color. Springer-Verlag, New York.
This second edition adds 4 new chapters: Random Forests, Ensemble Learning, Undirected Graphical Models, and High Dimensional Problems: p>>N. For more details see ESL book homepage.
In an agreement with Springer, we are able to offer for free the ESL book pdf (13Mb).

2008
-
Tong Tong Wu, Yi Fang Chen, Trevor Hastie, Eric Sobel and Kenneth Lange: Genomewide Association Analysis by Lasso Penalized Logistic Regression. We develop efficient computational procedures for screening large-scale genome-wide association studies. Bioinformatics 25(6): 714-721, 2009.
-
Ping Li, Ken Church and Trevor Hastie: One sketch for all: theory and application of conditional random sampling Nips08 proceedings.
-
Jerome Friedman, Trevor Hastie and Robert Tibshirani: Regularization Paths for Generalized Linear Models via Coordinate Descent. We use coordinate descent to develop regularization paths for linear, logistic and multinomial regression models. Our algorithms use the "elastic net" penalties of Zou and Hastie (2005), and create the path for a grid of values of the penalty parameter lambda. Journal of Statistical Software, 33(1), 2010
The R package glmnet is available from CRAN
A MATLAB wrapper for the glmnet Fortran code, written by Hui Jiang.
-
Jane Elith, John Leathwick and Trevor Hastie: A working guide to boosted regression trees (2008) Journal of Animal Ecology, 77 802-813. Here are the online supplement materials, along with the associated zip file. In 2004 received award for most highly cited paper in any of the British Ecological Society journals in the past 5 years.
-
Line Clemmensen, Trevor Hastie, Daniela Witten and Bjarne Ersboll: Sparse Discriminant Analysis. We extend penalized linear and mixture discriminant analysis by incorporating a lasso penalty to encourage sparseness. Technometrics (2011) 53(4) 406-413.
-
John Leathwick, Jane Elith, W. Chadderton, D. Rowe and Trevor Hastie: Dispersal, disturbance and the contrasting biogeographies of New Zealand's diadromous and non-diadromous fish species. An application of boosted regression trees in ecological mapping. J. Biogeography 2008, 35 1481-1497.
-
Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani: "Preconditioning" for feature selection and regression in high-dimensional problems. We show that supervised principal components followed by a variable selection procedure is an effective approach for variable selection in very high dimensions. Annals of Statistics 36(4), 2008, 1595-1618.

2007
-
Jerome Friedman, Trevor Hastie and Robert Tibshirani, Sparse inverse covariance estimation with the lasso.
-
Trevor Hastie. Comment on a paper in Statistical Science by Peter Bühlmann and Torsten Hothorn: Boosting Algorithms: Regularization, Prediction and Model Fitting (2007) 22(4), 477-522.
-
Jerome Friedman, Trevor Hastie and Robert Tibshirani, Discussion of "Evidence contrary to the statistical view of boosting (David Mease and Aaron Wyner)". Wyner and Mease show through examples some counter-intuitive results with boosting that appear to contradict our 2000 paper. We discount these claims by reversing their results using shrinkage along with boosting. JMLR 9 (2008) 59-64.
-
Jerome Friedman, Trevor Hastie, Holger Hoefling and Robert Tibshirani, Pathwise Coordinate Optimization. We show how coordinate descent algorithms can efficiently solve a number of popular regularized optimization problems, creating an entire path of solutions. We generalize this approach to derive an efficient algorithm for the fused lasso, both one- and two-dimensional. Annals of Applied Statistics (2007), 1(2), 302-332.
-
Ping Li, Trevor Hastie and Kenneth Church. Nonlinear Estimators and Tail Bounds for Dimension Reduction in L1 using Cauchy Random Projections. We provide improved methods for approximating L1 distances in very high dimensions, based on maximum-likelihood estimation in the Cauchy family. JMLR 8, pp 2497-2532
-
Ping Li and Trevor Hastie. A Unified Near-Optimal Estimator for Dimension Reduction in L_a (0< a <= 2) Using Stable Random Projections. NIPS2007 poster presentation
-
Brad Efron, Trevor Hastie and Rob Tibshirani, Discussion of the "Dantzig Selector" by Emmanuel Candes and Terence Tao. Candes and Tao propose an alternative but similar procedure to the lasso. This discussion appears alongside the original article in the Annals of Statistics 35(6), pp 2358-2364.
-
Trevor Hastie, Jonathan Taylor, Robert Tibshirani and Guenther Walther, Forward Stagewise Regression and the Monotone Lasso. We characterize the incremental forward stagewise procedure as a monotone version of the lasso. Electronic Journal of Statistics 1 (2007).

2006
-
Trevor Hastie and Ji Zhu, Discussion of "Support Vector Machines with Applications" by Javier M. Moguerza and Alberto Munoz, Statistical Science 21(3) 352-357.
-
Gill Ward, Trevor Hastie, Simon Barry, Jane Elith and John Leathwick, Presence-only data and the EM algorithm. We develop a method for fitting the two-class logistic regression model using labeled data from one class, a sample of unlabeled data, and knowledge of the class prevalences. Biometrics 65(2), 554-563, 2009.
Download the beta version of Gill Ward's R package ecogbm - ecogbm_1.01.tar.gz - for fitting boosted regression models for presence-only data. See the student section of my homepage for Gill Ward's thesis.
A presentation by Gill Ward based on this work "Making the Best Use of Available Data: The Presence-Only Problem in Ecology" won an honorable mention award at the 2007 Joint Statistical Meetings.
-
John Leathwick, Jane Elith and Trevor Hastie, Comparative performance of generalized additive models and multivariate adaptive regression splines for statistical modelling of species distributions. (2006) Ecological Modelling 199 188-196. This is a special issue of the journal devoted to the workshop on Advances in Predictive Species Distribution Models held in Riederalp, Switzerland, 2004.

-
Ping Li, Trevor Hastie and Kenneth Church, Very Sparse Random Projections. A method for approximating pairwise distances in very high-dimensional spaces. Best student paper, KDD-06, Philadelphia
-
Mee-Young Park and Trevor Hastie, Regularization Path Algorithms for Detecting Gene Interactions. We develop a path algorithm for fitting the "cosso" models of Yuan & Lin (2006) with logistic regression. This allows factors and interactions to enter the model in a smooth way.
-
Mee-Young Park and Trevor Hastie, Penalized Logistic Regression for Detecting Gene Interactions. A modified version of forward-stepwise logistic regression suitable for screening large numbers of gene-gene interactions.
stepPlr: R package for fitting PLR models.
-
Mee-Young Park, Trevor Hastie and Rob Tibshirani, Averaged gene expressions for regression. A regression method that combines the lasso with hierarchical clustering, intended for selecting groups of genes in microarray problems. Biostatistics (2007, 8, 212-227). R Software for fitting these models.
-
Yaqian Guo, Trevor Hastie and Robert Tibshirani Regularized Discriminant Analysis and its Application in Microarrays. A method, similar to shrunken centroids, for classification and discrimination of microarrays, using regularized discriminant analysis with gene selection. Biostatistics (in press; epub)
-
Mee-Young Park and Trevor Hastie, An L1 Regularization-path Algorithm for Generalized Linear Models. A generalization of the LARS algorithm for GLMs and the Cox proportional hazards model. Since the coefficient paths are piecewise-nonlinear, approximations are made using the predictor-corrector algorithm of convex optimization. glmpath: R software package for fitting L1 regularized GLMs and Cox models. (JRSSB 2007, 69, part 4, pages 659-677)
-
Ping Li, Trevor Hastie and Kenneth Church, Improving Random Projections Using Marginal Information. Methods for speeding up document search and characterization. Accepted at COLT 2006. This paper draws on results in the following two technical reports:
-
Ping Li, Trevor Hastie, and Kenneth Church, Margin-constrained Random Projections and Very Sparse Random Projections.
Ping Li, Kenneth Church and Trevor Hastie, A Sketch-based Sampling Algorithm on Sparse Data
-
Rob Tibshirani and Trevor Hastie, Margin Trees for High-dimensional Classification. A tree-structured representation for a multiclass SVM classifier.
-
Hui Zou, Ji Zhu and Trevor Hastie, New Multicategory Boosting Algorithms Based on Multicategory Fisher-Consistent Losses. We provide some general requirements for multiclass margin-based classifiers. Annals of Applied Statistics 2(4) pp 1290-1306 (2008).

2005
-
Ji Zhu, Hui Zou, Saharon Rosset and Trevor Hastie, Multi-class Adaboost. A multi-class generalization of the Adaboost algorithm, based on a generalization of the exponential loss.
Finally published in 2009 in Statistics and Its Interface Volume 2 (2009) 349-360.
Code on Ji Zhu's website
-
J. Leathwick, J. Elith, M. Francis, T. Hastie, P. Taylor. Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees.
(Marine Ecology Progress Series, published in 2006). A detailed analysis of species abundance using Poisson regression with boosted regression trees. All analyses were done using the gbm package in R (Greg Ridgeway).
-
J. Leathwick, D. Rowe, J. Richardson, J. Elith and T. Hastie, Using multivariate adaptive regression splines to predict the distributions of New Zealand's freshwater diadromous fish. Freshwater Biology 50 2034-2051.
Presence-absence species data are modelled using MARS along with GLM in R.
-
Mee-Young Park and Trevor Hastie, Hierarchical Classification using Shrunken Centroids. A technique for classification when the number of classes is large. It produces a hierarchically structured classification rule, with the hardest-to-separate classes at the terminal nodes.
-
Hui Zou and Trevor Hastie. Regularization and Variable Selection via the Elastic Net (pdf). JRSSB (2005) 67(2) 301-320. A compromise between ridge regression and the lasso, with the computational advantages of the lasso. The elastic net selects variables in correlated sets. An R package elasticnet is available from CRAN.
See interview on Essential Science Indicators.
Published minor correction.

2004
-
Hui Zou, Trevor Hastie, and Rob Tibshirani, On the "Degrees of Freedom" of the Lasso. A technical paper that establishes that the number of non-zero coefficients in a lasso model is unbiased for the effective degrees of freedom. Published in Annals of Statistics (2007), 35(5), 2173-2192.
-
Philip Beineke, Trevor Hastie and Shivakumar Vaithyanathan, The Sentimental Factor: Improving Review Classification via Human-Provided Information. Proceedings ACL 2004, Barcelona. (ACL: Association for Computational Linguistics)
-
Eric Bair, Trevor Hastie, Debashis Paul, and Robert Tibshirani. Prediction by Supervised Principal Components. Published in JASA 2006 101 No 473, pp 119-137.
-
NIPS2004 - The following papers were accepted for NIPS 2004:
-
Trevor Hastie, Saharon Rosset, Rob Tibshirani and Ji Zhu, The Entire Regularization Path for the Support Vector Machine (oral presentation)
Saharon Rosset, Hui Zou, Ji Zhu and Trevor Hastie, A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning (poster presentation)
-
Hui Zou, Trevor Hastie, and Rob Tibshirani. Sparse Principal Component Analysis. We present a new approach to principal component analysis that allows us to use an L1 penalty to ensure sparseness of the loadings. Published in JCGS 2006 15(2): 262-286. Software is available in the R package elasticnet from CRAN.
-
Trevor Hastie, Saharon Rosset, Rob Tibshirani and Ji Zhu. The Entire Regularization Path for the Support Vector Machine. JMLR, 5(Oct) 1391-1415. An algorithm for computing the two-class SVM solution for all possible values of the regularization parameter C, at essentially the computational cost of a single SVM fit. Not only does this allow for efficient model selection, but it also exposes the role of regularization for SVMs. Several MPEG movies show the sequence of solutions for different examples. SvmPath software package for R.
-
Trevor Hastie and Robert Tibshirani. Efficient Quadratic Regularization for Expression Arrays. Biostatistics (2004), 5(3), pp 329-340. Computational tricks for a large class of linear models fit by quadratic regularization.

2003
-
Jerome Friedman, Trevor Hastie, Saharon Rosset, Rob Tibshirani and Ji Zhu. Discussion of three Boosting papers Annals of Statistics, 2004, vol 32 (1) pp 102-107. The three papers are by (1) Wenxin Jiang, (2) Gabor Lugosi and Nicolas Vayatis, and (3) Tong Zhang.
-
Saharon Rosset, Ji Zhu and Trevor Hastie. Margin Maximizing Loss Functions (accepted poster Nips 2003)
-
Ji Zhu, Saharon Rosset, Trevor Hastie and Rob Tibshirani. 1-Norm Support Vector Machines (accepted spotlight poster Nips 2003)
-
Mu Zhu, Trevor Hastie and Guenther Walther. Constrained Ordination Analysis with Flexible Response Functions. Constrained ordination via nonparametric discriminant analysis. Ecological Modelling (2005), 187(4), 524-536.
-
Francesca Dominici, Aidan McDermott and Trevor Hastie. Improved Semiparametric Time Series Models of Air Pollution and Mortality [pdf Technical Report] JASA, December 2004, 99(468), 938-948. [Software]
-
Ji Zhu and Trevor Hastie Classification of Gene Microarrays by Penalized Logistic Regression. Biostatistics 5(3):427-443. R Code (from Ji Zhu's website)

2002
-
Trevor Hastie and Robert Tibshirani. Independent Component Analysis through Product Density Estimation (ps file). A direct statistical approach to ICA, using an attractive spline representation to model each of the marginal densities. A more recent (Nov 2002) talk (pdf)
-
Saharon Rosset, Ji Zhu and Trevor Hastie. Boosting as a Regularized Path to a Maximum Margin Classifier (pdf file) JMLR 5 (Aug 2004): 941-973. We show that a version of boosting fits a model by optimizing an L1-penalized loss function. This in turn shows that the corresponding versions of Adaboost and Logitboost converge to an "L1" optimal separating hyperplane.
-
Support Vector Machines, Kernel Logistic Regression and Boosting. Slides for talk given at Spring research conference in Michigan, MCS2002 in Sardinia, NPCONF2002 in Crete, and ASA2002 in New York.
-
Hongjuan Zhao, Trevor Hastie, Dr Michael L Whitfield, Prof Anne-Lise Borresen-Dale and Dr. Stefanie S Jeffrey
Optimization and evaluation of T7 based RNA linear amplification protocols for cDNA microarray analysis
BMC Genomics 2002, 3:31 (30 Oct 2002) BioMed Central online
-
Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, and Gilbert Chu. Class prediction by nearest shrunken centroids, with applications to DNA microarrays (ps file) (pdf file) This is a more statistical version of the PNAS paper below.
-
Rob Tibshirani, Trevor Hastie, B. Narasimhan and Gilbert Chu: "Diagnosis of multiple cancer types by shrunken centroids of gene expression" (PNAS website). PNAS 2002 99:6567-6572 (May 14). See PAM website for software (available soon).
-
Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani, Least Angle Regression. Annals of Statistics (with discussion) (2004) 32(2), 407-499. A new method for variable subset selection, with the lasso and "epsilon" forward stagewise methods as special cases. LARS Software for R and Splus.
-
Antoine Guisan, Thomas Edwards and Trevor Hastie. Generalized linear and generalized additive models in studies of species distributions: setting the scene. Ecological Modelling (2002) 157, 89-100.

2001
-
Trevor Hastie, Robert Tibshirani and Jerome Friedman, "Elements of Statistical Learning: Data Mining, Inference and Prediction" Springer-Verlag, New York.
-
Mu Zhu and Trevor Hastie, "Feature extraction for non-parametric discriminant analysis" JCGS (2003), 12(1), pages 101-120.
-
Ji Zhu and Trevor Hastie, "Kernel Logistic Regression and the Import Vector Machine" (NIPS, 2001; JCGS 2005). Copy of slides (pdf) presented by TH in Kyoto in December, 2001.
Ji Zhu's R code for fitting IVM models.
-
Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, Michael Eisen, Gavin Sherlock, Pat Brown, and David Botstein
Exploratory screening of genes and clusters from microarray experiments (ps file) or pdf version.
-
Therese Sorlie, Perou, C., Robert Tibshirani, Turid Aas, Stephanie Geisler, Hilde Johnsen, Trevor Hastie, Michael B. Eisen, Matt van de Rijn, Stefanie S. Jeffrey, Thor Thorsen, Hanne Quist, John C. Matese, Patrick O. Brown, David Botstein, Per Eystein Lonning, and Anne-Lise Borresen-Dale. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. PNAS 98: 10869-10874. pdf version.

2000
-
Trevor Hastie, Robert Tibshirani, David Botstein and Pat Brown, "Supervised Harvesting of Expression Trees" (postscript). Starting from a hierarchically clustered expression array, we build a predictive model for an outcome variable using cluster nodes as inputs.
(pdf version) Tech. report. August 2000.
-
Olga Troyanskaya, Michael Cantor, Gavin Sherlock, Pat Brown, Trevor Hastie, Robert Tibshirani, David Botstein and Russ B. Altman, Missing value estimation methods for DNA microarrays. Bioinformatics Vol. 17 no. 6, 2001, pages 520-525
-
Eva Cantoni and Trevor Hastie "Degrees-of-Freedom Tests for Smoothing Splines." Tech Report, May 2000.
Published in Biometrika 2002, 89, 251-263. A mixed-effects framework for smoothing splines and additive models allows for exact tests between nested models of different complexity. The complexity is calibrated via the effective degrees of freedom.
-
Thomas Yee and Trevor Hastie. Reduced Rank Vector Generalized Linear Models (2003) Statistical Modeling, 3, pages 15-41. Using the multinomial as a primary example, we propose reduced rank logit models for discrimination and classification. This is a conditional version of the reduced rank model of linear discriminant analysis.
-
Robert Tibshirani, Guenther Walther and Trevor Hastie.
"Estimating the number of clusters in a dataset via the Gap statistic". Journal of the Royal Statistical Society, B, 63:411-423, 2001.
-
Stochastic Modeling and Tracking of Human Motion, a joint project with Dirk Ormoneit and Michael Black's group at Xerox Parc, with motion graphics demonstrations of learned walking characteristics.
-
Page 50 of "Generalized Additive Models" by Hastie and Tibshirani, 1990, Chapman and Hall. Some copies of the 1999 printing by CRC Press replaced page 50 with a page from a history text! page50.ps or page50.pdf
-
Trevor Hastie, Laura Bachrach, Balasubramanian Narasimhan and May Choo Wang. Flexible Statistical Models for Growth Fragments: a Study of Bone Mineral Acquisition. Compare your own measurements using our online growth tables
-
Gareth James and Trevor Hastie. Functional Linear Discriminant Analysis for Irregularly Sampled Curves (2001) Journal of the Royal Statistical Society, Series B, 63, 533-550.
-
Trevor Hastie, Robert Tibshirani, Michael B Eisen, Ash Alizadeh, Ronald Levy, Louis Staudt, Wing C Chan, David Botstein, Patrick Brown. 'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns. This is an online version of the paper, published in the online journal GenomeBiology.
-
Trevor Hastie, Robert Tibshirani, Michael Eisen, Pat Brown, Doug Ross, Uwe Scherf, John Weinstein, Ash Alizadeh, Louis Staudt, David Botstein "Gene Shaving: a New Class of Clustering Methods for Expression Arrays". Postscript (2.9mb) or Adobe pdf (5.4mb) Tech. report. Jan 2000.
-
James, G., Hastie, T., and Sugar, C. "Principal Component Models for Sparse Functional Data". (2000, Biometrika, 87, 587-602) (pdf). When the data are collections of sampled curves or images, functional principal components produce the principal modes of variation. Here we generalize these procedures to deal with the case when each curve is sparsely and irregularly sampled.

1999
-
Hastie, T., Tibshirani, R., Sherlock, G., Eisen, M., Brown, P. and Botstein, D. "Imputing Missing Data for Gene Expression Arrays". Technical report (1999), Stanford Statistics Department. pdf (145Kb) or postscript (450Kb)
-
Tibshirani, R., Hastie, T. Eisen, M., Ross, D. , Botstein, D. and Brown, P. "Clustering methods for the analysis of DNA microarray data". Postscript (4.8mb) or Compressed Postscript (1.8mb) Tech. report Oct. 1999.
-
D. Ormoneit and T. Hastie. Optimal kernel shapes for local linear regression. In S. A. Solla, T. K. Leen, and K-R. Müller, editors, Advances in Neural Information Processing Systems 12. The MIT Press, 2000.
-
Tibshirani, R. and Lazzeroni, L. and Hastie, T. and Olshen, A. and Cox, D.R. "A Global Pairwise Approach to Radiation Hybrid Mapping". Technical Report January 1999. Using data of co-occurrence of hybridized markers after shattering, inference is made of the marker sequence in the chromosome.

1998
-
Friedman, J., Hastie, T. and Tibshirani, R. (Published version) Additive Logistic Regression: a Statistical View of Boosting Annals of Statistics 28(2), 337-407. (with discussion)
We show that boosting fits an additive logistic regression model by stagewise optimization of a criterion very similar to the log-likelihood, and present likelihood based alternatives. We also propose a multi-logit boosting procedure which appears to have advantages over other methods proposed so far. Here are the slides (2 per page) for my boosting talk.
-
Crellin, N., Hastie, T. and Johnstone, I. "Statistical Models for Image Sequences" Technical report, submitted to "Human Brain Mapping". We study fMRI sequences of the human brain obtained from experiments involving repetitive neuronal activity. We investigate the functional form of the hemodynamic response function, and provide evidence that the commonly adopted convolution model is inadequate.
-
Hastie, T. and Tibshirani, R. Bayesian Backfitting. The Gibbs sampler looks and feels like the backfitting algorithm for fitting additive models. Indeed, a simple modification to backfitting turns it into a Gibbs sampler for spitting out samples from the "posterior" distribution for an additive fit. Statistical Science 15, no. 3 (2000), 196-223
-
Wu, T., Hastie, T., Schmidler, S. and Brutlag, D. "Regression Analysis of Multiple Protein Structures". Models for lining up and averaging groups of protein structures.

1997
-
Rubinstein, D. and Hastie, T. "Discriminative vs Informative Learning" A comparison of two frequently used but different paradigms for training classifiers.
-
Maes, S. and Hastie, T. "Dynamic Mixtures of Splines: a Model for Saliency Grouping in the Time Frequency Plane" This is an application of mixture modeling to speech data. We use a moving mixture of Gaussians to represent the formant-frequencies in speech data.
-
Hastie, T., and Tibshirani, R. and Buja, A. "Flexible Discriminant and Mixture Models" In edited proceedings of "Neural Networks and Statistics" conference, Edinburgh, 1995. J. Kay and D. Titterington, Eds. Oxford University Press.
-
Wu, T., Schmidler, S., Hastie, T., and Brutlag, D. "Modelling and superposition of multiple protein structures using affine transformations: analysis of the globins"
-
James, G., and Hastie, T. "Generalizations of the Bias/Variance Decomposition for Prediction Error". Several papers have recently appeared on this topic, and each has a different viewpoint and decomposition. We hope ours does not add to the confusion.
-
James, G., and Hastie, T. Error Coding and PaCTs. Gareth James' winning paper in the ASA student paper competition for the Statistical Computing Section. This is one of four winning papers.

1996
-
Hastie, T. "Neural Networks", to appear in Encyclopaedia of Biostatistics. A brief survey with some personal points of view.
-
Hastie, T., and Tibshirani, R. "Classification by Pairwise Coupling" We solve a multiclass classification problem by combining all the pairwise rules. This paper builds on ideas proposed by J. Friedman. An abbreviated version is published in Advances in Neural Information Processing Systems 10, M. I. Jordan, M. J. Kearns, S. A. Solla, eds., MIT Press, 1998.
-
Hastie, T. and Tibshirani, R. "Generalized Additive Models" to appear in "Encyclopaedia of Statistical Sciences". A survey paper on GAMs.

1995
-
Hastie, T. Pseudosplines. We develop an approximation to any smoother, but typically a smoothing spline. The idea is to develop an orthonormal basis and penalty sequence, which would then be used via a generalized ridge regularization. JRSSB(1996) 58(2), 379-396.
-
Hastie, T. and Simard, P. "Models and Metrics for Handwritten Character Recognition". This paper gives a brief survey of techniques for handwritten digit recognition, and then examines in some detail a particular technique based on invariant distance. Statistical Science 13(1) 1998, pp 54-65. (published version)
-
Hastie, T. and Tibshirani, R. "Discriminant Adaptive Nearest Neighbor Classification." IEEE PAMI, 18, 607-616, 1996.
-
Hastie, T. and Tibshirani, R. Generalized Additive Models for Medical Research. Encyclopaedia for Biostatistics 4 187-196

1994
-
Hastie, T. and Tibshirani, R. "Discriminant Analysis by Gaussian Mixtures." JRSSB (Jan 1996). There is also a longer technical report of Feb 1994
-
Hastie, T. J., Buja, A., and Tibshirani, R. "Penalized Discriminant Analysis." (Bell Labs technical report; postscript). Pdf of published version, Annals of Statistics, 1995.
-
Hastie, T.J., Simard, P. Y., and Saeckinger, E. "Learning Prototype Models for Tangent Distance." NIPS proceedings, 1994.
-
Hastie, T.J and Tibshirani, R. "Handwritten Digit Recognition via Deformable Prototypes." AT&T Bell Laboratories Technical Report, 1994.

1993
-
Hastie, T. J., Tibshirani, R. and Buja, A. "Flexible Discriminant Analysis by Optimal Scoring." (Bell Labs Technical report; postscript). Pdf of published version, JASA, December 1994.
-
Roosen, C. B. and Hastie, T. "Automatic Smoothing Spline Projection Pursuit." AT&T Bell Labs Technical Report (Dec. 1993). Journal of Computational and Graphical Statistics 3(3) Sep 1994, 235-248. Here is JSTOR link
-
Roosen, C. B. and Hastie, T. "Logistic Response Projection Pursuit." AT&T Bell Labs Technical Report (Aug. 1993).

1992
-
Hastie, T. J., Kishon, E., Clark, M., and Fan, J. "A Model for Signature Verification." AT&T Bell Laboratories Technical Report (Feb. 1992).

1990
-
Hastie, T. J., and Pregibon, D. "Shrinking Trees." AT&T Bell Laboratories Technical Report (March 1990). Unpublished manuscript. Thanks to Mu Zhu for turning the pre-web technical memorandum into an online document.

1989
-
Buja, A., Hastie, T. J., and Tibshirani, R. Linear Smoothers and Additive Models. An overview of linear smoothing technology, including a proof of the convergence of the backfitting algorithm. Annals of Statistics (with discussion) 1989, 17(2) 453-555.
-
Hastie, T. and Stuetzle, W. Principal Curves. Journal of the American Statistical Association 1989, 84 502-516
-
Hastie, T., Botha, J., Schnitzler, C. Regression with an Ordered Categorical Response Statistics in Medicine 1989, 8 785-794

1987
-
Hastie, T. J., and Pregibon, D. "A new algorithm for matched case-control studies with applications to additive models." AT&T Bell Laboratories Technical Report (November 1987). Unpublished manuscript. A shorter version appeared in the proceedings of Compstat 1988
-
Hastie, T. and Little, F. Principal Profiles A nonlinear version of principal components for compositional data, that helps explain the horse-shoe effect. Published in Proceedings of the Interface Meeting (Comp.Sci and Stat), 1987.
-
Greenacre, M. and Hastie, T. A Geometric Interpretation of Correspondence Analysis There are many ways to think of CA. This paper presents it as a form of PCA, or subspace approximation in a Chi-squared metric. JASA (82) June 1987.
-
Hastie, T. A Closer Look at the Deviance The deviance for GLMs provides the analog in many ways for sum-of-squares in linear regression. This little paper surveys the connections. The American Statistician 41(1), 1987, pp 16-20

1986
-
Hastie, Trevor, and Tibshirani, R. "Generalized Additive Models (with discussion)" 1986 Statistical Science Vol 1, No 3, pages 297-318

1984
-
Principal Curves and Surfaces SLAC (Stanford Linear Accelerator Center) has put up a pdf version of my Ph.D thesis.

Original principal curves and surfaces movie (youtube).
-

Generalized Additive Models: the original technical report, written by two PhD students.

1980
-
The Stability of Ordinal Measures of Association in Contingency Tables My University of Cape Town MS thesis.









