HASTIE, T., TIBSHIRANI, R. and FRIEDMAN, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, 2001. xvi + 533 pp. $74.95/59.00. ISBN 0-387-95284-5.

Data analysis, which not long ago was primarily the domain of statistics, has evolved dramatically in the last few decades. This is almost entirely a consequence of the revolution in computing that has occurred over that period. At the start of this revolution, researchers could suddenly perform analyses they might previously have balked at, such as inverting large matrices or estimating parameters by iterative methods, but analyses they would nevertheless have contemplated. Gradually, however, things advanced, so that nowadays tools can be applied which would be quite inconceivable without machine assistance. This book is essentially about such methods: it describes modern tools for data analysis. Because these tools owe so much to computation, it is not surprising that many of the developments described in the book owe at least part of their genesis to disciplines other than statistics: to areas such as machine learning, pattern recognition, data mining, artificial intelligence and computer science in general. Nonetheless, the theoretical glue binding them together is manifestly that of statistics. This book supplies that glue, describing a wide variety of methods and showing their common aspects, their relative merits and their practical properties.

The book distinguishes between supervised and unsupervised learning, though only one of the fourteen chapters is devoted to the latter. In supervised learning, one tries to learn the relationship between some predictor variables and an outcome variable from a sample of objects for which both predictors and outcome are known. Classical regression fits into this mould, as do tools such as discriminant analysis, generalized additive models, feed-forward neural networks, kernel regression, nearest-neighbour methods and support vector machines. All these, and many more, are covered in this book. In unsupervised learning, by contrast, no outcome measurements are available; the aim is to summarise or model the data in a way convenient for some objective. Cluster analysis, principal components analysis, multidimensional scaling and the discovery of association rules are of this kind, and these and others are described in the last chapter of the book. The book also includes elegant descriptions of recent but deep statistical ideas, such as model averaging, decomposition into basis functions and model complexity.
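The distinction can be made concrete with a small sketch, not taken from the book itself: a one-nearest-neighbour rule (supervised, using known outcomes) set beside a one-dimensional k-means clustering (unsupervised, using predictors alone), both on toy data invented here for illustration.

```python
# Illustrative sketch, assuming toy one-dimensional data (not from the book).

def nearest_neighbour_predict(train_x, train_y, query):
    """Supervised: predict the label of `query` by copying the label
    of the closest training point (the 1-nearest-neighbour rule)."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[best]

def k_means_1d(points, centres, iterations=10):
    """Unsupervised: group unlabelled points around k centres
    (Lloyd's algorithm in one dimension); no outcome variable is used."""
    for _ in range(iterations):
        # Assign each point to its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            j = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[j].append(p)
        # Move each centre to the mean of its cluster.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres

# Supervised: predictors paired with known outcomes.
x, y = [1.0, 2.0, 8.0, 9.0], ["small", "small", "large", "large"]
print(nearest_neighbour_predict(x, y, 1.5))   # -> small

# Unsupervised: the same predictors, but with no outcomes at all.
print(sorted(k_means_1d([1.0, 2.0, 8.0, 9.0], [0.0, 5.0])))  # -> [1.5, 8.5]
```

The two functions consume the same predictor values; only the first ever sees an outcome, which is precisely the supervised/unsupervised divide the book draws.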

Although there is substantial mathematics in the book, it is not a mathematics text: the mathematics is very much a vehicle towards an end, not an end in itself. This befits the subject matter of modern, and hence computational, statistical tools. However, the book is intended for data analysts or for those wishing to learn the theory behind such modern tools: it contains no computer code, or even pseudocode. Readers who wish to develop tools based on the methods described here would be expected to translate the mathematics into code themselves. More probably, such readers (and one imagines most readers would be of this kind) would adopt one of the data-analytic software languages that already exist, although the book gives no guidance on such languages. There are exercises at the end of each chapter, but these are generally exercises in theory rather than in practical data analysis.

This is a beautiful book, not only in presentation, where it makes excellent use of colour, but also in content and style. It would make a first-class text for an advanced undergraduate or an initial graduate course in modern statistical tools, although the theory and methods described in the book should be supplemented by practical work using a language such as S-PLUS or R. The authors, all experts and prime movers in the field, have done a superb job.

Imperial College,
London, U.K.