Datasets for "The Elements of Statistical Learning"


14-cancer microarray data: Info   Training set gene expression, Training set class labels, Test set gene expression, Test set class labels.
The indices in the cross-validation folds used in Sec 18.3 are listed in CV folds.

Bone Mineral Density: Info   Data   Larger dataset with ethnicity included: spnbmd.csv

Countries: Info   Data

Galaxy: Info   Data

Los Angeles Ozone: Info   Data

Marketing: Info   Data

Mixture Simulation: Info   Data

NCI (microarray): Info   Data (csv)   Labels (text)

NMR: Data is from Wavelab   (csv) nmr1.csv

Ozone: Info   Data

Phoneme: Info   Data

Prostate: Info   Data

Protein flow cytometry data: Info   Data   Covariance matrix

Radiation sensitivity data: Info   Gene expression data   Outcome

SRBCT microarray data: Info   Training set gene expression, Training set class labels, Test set gene expression, Test set class labels

Signatures data: Info   Data

Skin of the Orange (Section 12.3.4): Info   Data

South African Heart Disease: Info   Data

Spam: Info   Data and test set Indicator
For more information, see the UCI spambase directory.
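
Because the Spam entry ships the data and the test set indicator as separate files, the two have to be combined before fitting anything. The following is a minimal sketch of that step, assuming the indicator holds one 0/1 flag per data row with 1 marking a test row (an assumption to check against the Info file). The function name and the toy rows are hypothetical stand-ins, not the real data.

```python
def split_by_indicator(rows, indicator):
    """Split rows into (train, test) using a per-row 0/1 indicator.

    Assumption: 1 marks a test row and 0 marks a training row.
    """
    if len(rows) != len(indicator):
        raise ValueError("rows and indicator must have the same length")
    train = [row for row, flag in zip(rows, indicator) if flag == 0]
    test = [row for row, flag in zip(rows, indicator) if flag == 1]
    return train, test


# Toy stand-in for the data file: four rows of made-up values.
rows = [[0.1, 1], [0.0, 0], [0.5, 1], [0.2, 0]]
# Toy stand-in for the indicator file: the second row is held out for testing.
indicator = [0, 1, 0, 0]

train, test = split_by_indicator(rows, indicator)
```

Keeping the split in one small function makes it easy to flip the convention if the Info file turns out to define the flag the other way round.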

Vowel: Info,   Training and Test data.

Waveform: Info,   Training and Test data, and a generating function waveform.S (Splus or R).

ZIP code: Info,   gzipped Training and Test data.
Since the training data are somewhat large, you can access the digits separately.
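
Since the ZIP code files are gzipped, they can be read without unpacking them to disk first. The sketch below assumes each line holds a digit label followed by whitespace-separated pixel values; that layout is an assumption to verify against the Info file. The function name and the in-memory sample are hypothetical.

```python
import gzip
import io


def read_digit_rows(fileobj):
    """Read (labels, pixels) from a gzipped, whitespace-separated text file.

    Assumption: the first field on each line is the digit label and the
    remaining fields are pixel values. Blank lines are skipped.
    """
    labels, pixels = [], []
    with gzip.open(fileobj, mode="rt") as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue
            labels.append(int(float(fields[0])))
            pixels.append([float(value) for value in fields[1:]])
    return labels, pixels


# Tiny in-memory sample standing in for a real gzipped file.
sample = gzip.compress(b"6 -1.0 0.5 1.0\n\n3 0.0 -0.25 1.0\n")
labels, pixels = read_digit_rows(io.BytesIO(sample))
```

`gzip.open` also accepts a file path, so the same function should work on a downloaded file by passing its name instead of a `BytesIO` object.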