khan2001: Childhood Cancer Study of Khan et al. (2001)

Description

Gene expression data (2308 genes for 88 samples) from the microarray study of Khan et al. (2001).

Usage

data(khan2001)

Format

khan2001$x is an 88 x 2308 matrix containing the expression levels. Rows correspond to samples and columns to genes. The row names are the original image IDs, and the column names are the original probe labels.

khan2001$y is a factor containing the diagnosis for each sample ("BL", "EWS", "NB", "non-SRBCT", "RMS").

khan2001$descr provides some annotation for each gene.
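The three components are linked by their dimensions: each row of x has one entry in y, and each column of x has one entry in descr. A minimal sketch of a consistency check follows. The helper name check.layout is hypothetical and not part of sda, the check assumes descr is a plain vector with one element per gene, and the usage assumes the sda package is installed.

```r
# hypothetical helper (not part of sda): check that x, y and descr line up
check.layout = function(d)
{
  stopifnot(is.matrix(d$x), is.factor(d$y))
  c(samples.match = nrow(d$x) == length(d$y),      # rows of x are samples
    genes.match   = ncol(d$x) == length(d$descr))  # columns of x are genes (assumes descr is a vector)
}

# usage, assuming the sda package is installed
if (requireNamespace("sda", quietly = TRUE))
{
  data(khan2001, package = "sda")
  print(check.layout(khan2001))
}
```

Running such a check after subsetting samples helps confirm that only x and y were reduced, while the per-gene annotation was left untouched.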

Details

This data set contains measurements of the gene expression of 2308 genes for 88 observations: 29 cases of Ewing sarcoma (EWS), 11 cases of Burkitt lymphoma (BL), 18 cases of neuroblastoma (NB), 25 cases of rhabdomyosarcoma (RMS), and 5 other (non-SRBCT) samples.
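The class sizes given above can be compared with a tabulation of the diagnosis factor. The sketch below is illustrative only: class.sizes is a hypothetical helper, not an sda function, and the usage assumes the sda package is installed.

```r
# hypothetical helper (not part of sda): samples per diagnosis, largest class first
class.sizes = function(y)
{
  sort(table(y), decreasing = TRUE)
}

# usage, assuming the sda package is installed
if (requireNamespace("sda", quietly = TRUE))
{
  data(khan2001, package = "sda")
  print(class.sizes(khan2001$y))        # one count per diagnosis
  print(sum(class.sizes(khan2001$y)))   # total number of samples
}
```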

Source

The data are described in Khan et al. (2001). Note that the values in khan2001$x are log-transformed (natural logarithm) for normalization.
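Because the stored values are natural logarithms, exp() returns them to the original scale, and dividing by log(2) expresses them on the log2 scale that is common for microarray data. The sketch below uses a toy matrix in place of khan2001$x; ln.to.log2 is a hypothetical helper and not part of sda.

```r
# hypothetical helper (not part of sda): convert natural-log values to log2
ln.to.log2 = function(x)
{
  x / log(2)
}

m = matrix(log(c(1, 2, 4, 8)), 2, 2)   # toy natural-log values standing in for khan2001$x
exp(m)          # values back on the original scale
ln.to.log2(m)   # the same values on the log2 scale
```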

References

Khan et al. 2001. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nature Medicine 7:673–679.

Examples

# load sda library
library("sda")

# load full Khan et al. (2001) data set
data(khan2001)
dim(khan2001$x) # 88 2308
hist(khan2001$x)
khan2001$y # 5 levels

# data set containing the SRBCT samples
get.srbct = function()
{
  data(khan2001)
  idx = which( khan2001$y == "non-SRBCT" )
  x = khan2001$x[-idx,]
  y = factor(khan2001$y[-idx])
  descr = khan2001$descr   # annotation is per gene, so it is not subset by sample index

  list(x=x, y=y, descr=descr)
}

srbct = get.srbct()
dim(srbct$x)   # 83 2308
hist(srbct$x)
srbct$y # 4 levels

Example output

Loading required package: entropy
Loading required package: corpcor
Loading required package: fdrtool
[1]   88 2308
 [1] EWS       EWS       EWS       EWS       EWS       EWS       EWS      
 [8] EWS       EWS       EWS       EWS       EWS       EWS       EWS      
[15] EWS       EWS       EWS       EWS       EWS       EWS       EWS      
[22] EWS       EWS       BL        BL        BL        BL        BL       
[29] BL        BL        BL        NB        NB        NB        NB       
[36] NB        NB        NB        NB        NB        NB        NB       
[43] NB        RMS       RMS       RMS       RMS       RMS       RMS      
[50] RMS       RMS       RMS       RMS       RMS       RMS       RMS      
[57] RMS       RMS       RMS       RMS       RMS       RMS       RMS      
[64] non-SRBCT non-SRBCT non-SRBCT NB        RMS       non-SRBCT non-SRBCT
[71] NB        EWS       RMS       BL        EWS       RMS       EWS      
[78] EWS       EWS       RMS       BL        RMS       NB        NB       
[85] NB        NB        BL        EWS      
Levels: BL EWS NB non-SRBCT RMS
[1]   83 2308
 [1] EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS EWS
[20] EWS EWS EWS EWS BL  BL  BL  BL  BL  BL  BL  BL  NB  NB  NB  NB  NB  NB  NB 
[39] NB  NB  NB  NB  NB  RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS RMS
[58] RMS RMS RMS RMS RMS RMS NB  RMS NB  EWS RMS BL  EWS RMS EWS EWS EWS RMS BL 
[77] RMS NB  NB  NB  NB  BL  EWS
Levels: BL EWS NB RMS

sda documentation built on Nov. 22, 2021, 1:07 a.m.
