
subtype performs a biclustering procedure on an input dataset and assesses whether the resulting clusters are promising subtypes.

```
subtype(GEset, outcomeLabels, treatment = NULL, Npermutes = 10,
        Nchunks = 25, minClusterSizeB = 20, NclustersASet = 100,
        FDRpermutation = TRUE, nFDRperm = 50, seed = NULL,
        testMode = "quick", survivaltimes = NULL, method = "penalized",
        top_best_probes = 100, Niter = 20, showMovie = 0,
        redefineSubtypeMembers = 0, holdOut = 10)
```
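Because a full run can take a while, it may help to check the inputs first. The following is a minimal sketch, assuming only the shape requirements stated for GEset and outcomeLabels in the argument descriptions; `check_subtype_inputs` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): checks the input shape
# described for GEset and outcomeLabels before a long run is started.
check_subtype_inputs <- function(GEset, outcomeLabels) {
  stopifnot(is.matrix(GEset))
  if (is.null(rownames(GEset)) || is.null(colnames(GEset))) {
    stop("GEset needs both row names (variables) and column names (subjects)")
  }
  if (length(outcomeLabels) != ncol(GEset)) {
    stop("outcomeLabels must have one entry per column (subject) of GEset")
  }
  if (length(unique(outcomeLabels)) != 2) {
    stop("outcomeLabels must be binary (exactly two distinct values)")
  }
  invisible(TRUE)
}

# Toy input: 6 variables (rows) by 4 subjects (columns).
GEset <- matrix(rnorm(24), nrow = 6,
                dimnames = list(paste0("G", 1:6), paste0("P", 1:4)))
labels <- c(1, 1, 2, 2)
check_subtype_inputs(GEset, labels)
```

The helper stops with an error message on the first violated requirement and returns TRUE invisibly otherwise.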

`GEset`
p-by-n data matrix, where p is the number of variables (e.g. genes) and n is the number of subjects. Row and column names are required.

`outcomeLabels`
n-by-1 vector of binary prognosis labels assigned to the subjects. The order of the subjects must match that of GEset.

`treatment`
NULL.

`Npermutes`
Number of permutations of the variables. In each permutation, the variables are assigned to different chunks. The default is 10.

`Nchunks`
Number of chunks into which the variables are split. When there are too many variables for one clustering analysis, the variables are split into Nchunks chunks. The default is 25.

`minClusterSizeB`
The minimum number of subjects in each selected subtype. The default is 20.

`NclustersASet`
Number of groups into which the tree from hierarchical clustering is cut. The default is 100.

`FDRpermutation`
Whether the FDR computation is based on a permutation procedure. The default is TRUE.

`nFDRperm`
Number of permutations used to compute the FDR. The default is 50.

`seed`
Seed number for reproducibility.

`testMode`
The mode is fixed at "quick".

`survivaltimes`
NULL.

`method`
"penalized" is used.

`top_best_probes`
Number of top-ranked probes used in the t-test; these are the input for penalized. The default is 100.

`Niter`
The number of iterations of (TrainingSet, TestSet) -> training -> test -> recordResults. The default is 20.

`showMovie`
Display ROC/Surv curves and heatmaps. The default is 0.

`redefineSubtypeMembers`
Detect subtype members after every hold-out. The default is 0.

`holdOut`
Number of subjects held out of the subtype, i.e. Nsubtype - holdOut = Ntraining_set. The default is 10.
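To make the roles of Npermutes, Nchunks and NclustersASet more concrete, here is a rough base R sketch of the general idea: shuffle the variables, deal them into chunks, then cluster one chunk and cut the tree. It is an illustration only and does not reproduce the package's internal code. The toy matrix `X`, the Euclidean distance, the default `hclust` linkage, and the choice to cluster variables within a chunk are all assumptions made for the example.

```r
set.seed(1)
p <- 40   # number of variables (rows)
n <- 12   # number of subjects (columns)
X <- matrix(rnorm(p * n), nrow = p,
            dimnames = list(paste0("G", seq_len(p)), paste0("P", seq_len(n))))

Npermutes <- 2
Nchunks <- 4
NclustersASet <- 3

# For each permutation, shuffle the variable names and deal them into
# Nchunks roughly equal chunks, so chunk membership changes between
# permutations.
chunk_sets <- lapply(seq_len(Npermutes), function(i) {
  shuffled <- sample(rownames(X))
  split(shuffled, rep(seq_len(Nchunks), length.out = p))
})

# Cluster the variables of one chunk and cut the tree into NclustersASet
# groups. Distance and linkage are arbitrary choices for this sketch.
first_chunk <- chunk_sets[[1]][[1]]
tree <- hclust(dist(X[first_chunk, , drop = FALSE]))
groups <- cutree(tree, k = NclustersASet)
table(groups)
```

Smaller chunks keep each distance matrix manageable, which is the motivation given for Nchunks when the number of variables is large.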

This implements a biclustering algorithm to find hidden subtypes in a dataset. Calling summary on the result provides an FDR-based measure and its p-value for assessing the subtypes. Note that the R package rsmooth must be installed before running subtype; rsmooth can be downloaded from http://www.meb.ki.se/~yudpaw. For large datasets the computation can be heavy, so users should consider parallel processing in R.
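Since rsmooth has to be present before subtype is run, a small check like the following can catch a missing installation early. This is a minimal sketch using base R's `requireNamespace`; the variable name `has_rsmooth` and the message text are chosen here for illustration.

```r
# TRUE if the rsmooth package can be loaded from the current library paths,
# FALSE otherwise (nothing is attached to the search path).
has_rsmooth <- requireNamespace("rsmooth", quietly = TRUE)

if (!has_rsmooth) {
  message("rsmooth is not installed; install it before calling subtype().")
}
```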

resultsAll: a matrix including subtypeID and summary statistics for each subtypeID. For a specific subtypeID, it includes the number of genes, the number of subjects, and the area of low p-values (low_pValue_Area).

GenesDefiningSubtypes: variables in each subtypeID. These can be identified with subtypeID.

SubtypePatients: subjects in each subtypeID. These can be identified with subtypeID.

Andrey Alexeyenko, Woojoo Lee (maintainer:[email protected]) and Yudi Pawitan

Alexeyenko, A. et al. (2011) Estimation of false discovery rate in a heterogeneous population.

```
set.seed(1234)
p  <- 100  # number of variables
n1 <- 5    # number of samples from population 1
n2 <- 5    # number of samples from population 2
group <- c(rep(1, length.out = n1), rep(2, length.out = n2))

# Simulated data: subjects in rows, variables in columns.
data <- matrix(rnorm((n1 + n2) * p), (n1 + n2), p)
rownames(data) <- paste0("P", runif(nrow(data)))  # row (subject) names
colnames(data) <- paste0("G", runif(ncol(data)))  # column (variable) names

# The following procedure takes ~ 1 minute.
# GEset must be p-by-n, so the data matrix is transposed.
A <- subtype(
  GEset = t(data),
  outcomeLabels = group,
  Npermutes = 2,
  Nchunks = 5,
  NclustersASet = 3,
  seed = 1234
)

# f.out can be used for filtering out uninteresting subtypes,
# e.g. if f.out = 2, subtypes having N01_0 <= 2 are ignored.
summary(A, f.out = 0)
```
