Datasets for "The Elements of Statistical Learning"
14-cancer microarray data: Info
Training set gene expression,
Training set class labels,
Test set gene expression,
Test set class labels.
The indices in the cross-validation folds used in Sec 18.3 are listed in CV folds.
Bone Mineral Density: Info
Data
Larger dataset with ethnicity included: spnbmd.csv
Countries: Info
Data
Galaxy: Info
Data
Los Angeles Ozone: Info
Data
Marketing: Info
Data
Mixture Simulation: Info
Data
NCI (microarray): Info
Data (csv)
Labels (text)
NMR: Data are from Wavelab (csv): nmr1.csv
Ozone: Info
Data
Phoneme: Info
Data
Prostate: Info
Data
Protein flow cytometry data: Info
Data
Covariance matrix
Radiation sensitivity data: Info
gene expression data
outcome
SRBCT microarray data: Info
Training set gene expression,
Training set class labels,
Test set gene expression,
Test set class labels.
Signatures data: Info
Data
Skin of the Orange (Section 12.3.4): Info
Data
South African Heart Disease: Info
Data
Spam: Info
Data and test set indicator
For more information, see the UCI spambase directory.
Vowel: Info, Training and Test data.
Waveform: Info, Training and Test data, and a generating function waveform.S (S-PLUS or R).
ZIP code: Info, gzipped Training and Test data.
Since the training data are somewhat large, you can access the digits separately.
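
The gzipped files can be read without unpacking them to disk first. The following is a minimal Python sketch, not part of the original page. It assumes each row is whitespace-delimited with the class label in the first column followed by the feature values; check the Info file for the actual layout. The function name load_labeled_rows and the in-memory sample data are hypothetical and used only for illustration.

```python
import gzip
import io


def load_labeled_rows(fileobj):
    """Parse gzipped, whitespace-delimited rows into (labels, features).

    Assumed layout (hypothetical, verify against the Info file):
    the first field of each row is the class label, the rest are features.
    """
    labels, features = [], []
    # gzip.open accepts a path or an open binary file object.
    with gzip.open(fileobj, mode="rt") as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            labels.append(int(float(fields[0])))
            features.append([float(value) for value in fields[1:]])
    return labels, features


# Tiny in-memory stand-in for a downloaded file (made-up values).
sample = gzip.compress(b"6.0 -1.0 0.5 1.0\n3.0 0.0 -0.25 -1.0\n")
labels, features = load_labeled_rows(io.BytesIO(sample))
```

To read a downloaded file instead, pass its local path in place of the BytesIO object.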