The leukemia dataset (learning set) contains gene expression levels for 3051 genes in 38 patient samples from Golub et al. (1999). The data have been pre-processed in three steps: thresholding, with a floor of 100 and a ceiling of 16000; filtering, which excludes genes with max/min <= 5 or max - min <= 500, where max and min are respectively the maximum and minimum intensities of a particular gene across mRNA samples; and base-2 logarithmic transformation.
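The three pre-processing steps can be sketched as follows. This is only an illustration, not the code used to produce the dataset: the function name `preprocess`, its parameter names, and the toy matrix are hypothetical, and the sketch assumes the steps are applied in the order listed (threshold, then filter, then log-transform) to a genes-by-samples table of raw intensities.

```python
import math

def preprocess(rows, floor=100.0, ceiling=16000.0, min_ratio=5.0, min_diff=500.0):
    """Threshold, filter, and log2-transform a genes x samples table.

    `rows` is a list of per-gene intensity lists (hypothetical input layout).
    Returns (original_gene_index, log2_values) pairs for the genes that are kept.
    """
    kept = []
    for gene_index, row in enumerate(rows):
        # Step 1: clamp each intensity into [floor, ceiling].
        capped = [min(max(v, floor), ceiling) for v in row]
        hi, lo = max(capped), min(capped)
        # Step 2: exclude genes with max/min <= 5 or max - min <= 500.
        if hi / lo <= min_ratio or hi - lo <= min_diff:
            continue
        # Step 3: base-2 logarithm of the thresholded intensities.
        kept.append((gene_index, [math.log2(v) for v in capped]))
    return kept

# Toy example with made-up intensities (3 genes x 3 samples).
toy = [
    [50.0, 200.0, 20000.0],   # clamped to 100..16000, passes both filters
    [300.0, 400.0, 500.0],    # max/min is below 5, excluded
    [100.0, 550.0, 600.0],    # max - min equals 500, excluded
]
result = preprocess(toy)
```

Because thresholding guarantees every value is at least 100, the ratio max/min never divides by zero and the logarithm is always defined.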
Golub: a gene expression matrix of 3051 genes x 38
samples. The samples comprise 11 acute myeloid leukemia (AML) cases and 27
acute lymphoblastic leukemia (ALL) cases; the ALL cases can be further
subtyped into 19 B-cell ALL and 8 T-cell ALL.
Golub et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, Vol. 286:531-537.