A central task in genomic data analysis for stratified medicine is class discovery, which is accomplished through clustering. However, an unresolved problem with current clustering algorithms is that they do not test the null hypothesis or derive p values. To solve this, we developed a novel hypothesis testing framework based on consensus clustering, called Monte Carlo Consensus Clustering (M3C). M3C uses a multicore-enabled Monte Carlo simulation to generate a distribution of stability scores for each number of clusters, using null datasets with the same gene-gene correlation structure as the real one. These distributions are used to derive p values, and a beta distribution is fitted to the data to cheaply estimate p values beyond the limits of the simulation. M3C improves accuracy, allows rejection of the null hypothesis, removes systematic bias, and uses p values to make class number decisions. We believe M3C deals with a major pitfall in current automated class discovery tools.
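The p value step described above can be illustrated with a minimal sketch. This is not the M3C implementation (the package itself is written in R). It assumes a stability score in (0, 1) where lower means more stable, and it draws stand-in null scores from a known beta distribution instead of clustering simulated null datasets with a preserved gene-gene correlation structure. All function names, the observed score and the number of simulations are hypothetical.

```python
import math
import random


def empirical_p_value(observed, null_scores):
    """Monte Carlo p value: share of null scores at least as low as the observed one.

    The +1 terms stop the estimate from reaching exactly zero, so the smallest
    possible value is 1 / (number of simulations + 1).
    """
    hits = sum(1 for s in null_scores if s <= observed)
    return (hits + 1) / (len(null_scores) + 1)


def fit_beta_moments(scores):
    """Method-of-moments estimates of the beta shape parameters (a, b)."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def beta_cdf(x, a, b, steps=20000):
    """Beta CDF at x, by midpoint-rule integration of the density over [0, x]."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(
            log_norm + (a - 1) * math.log(t) + (b - 1) * math.log1p(-t)
        )
    return total * h


# Stand-in null distribution (assumption): in a real analysis each score would
# come from clustering one simulated null dataset for a given number of clusters.
rng = random.Random(0)
null_scores = [rng.betavariate(8, 4) for _ in range(1000)]

observed = 0.05  # hypothetical stability score of the real data

p_empirical = empirical_p_value(observed, null_scores)
a_hat, b_hat = fit_beta_moments(null_scores)
p_beta = beta_cdf(observed, a_hat, b_hat)

print("empirical p value:", p_empirical)
print("beta-estimated p value:", p_beta)
```

With 1000 simulations the empirical p value cannot drop below 1/1001, whereas the fitted beta distribution can assign a smaller tail probability to the same observed score. That is the sense in which fitting a distribution estimates p values beyond the limits of the simulation. The method-of-moments fit and the numerical integration are choices made for this sketch only, to keep it free of third-party dependencies.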
Package details 


Author  Christopher John [aut, cre] 
Bioconductor views  Clustering GeneExpression RNASeq Sequencing Transcription 
Maintainer  Christopher John <[email protected]> 
License  AGPL-3 
Version  1.0.0 
Package repository  Bioconductor 
Installation 
Install the latest version of this package from Bioconductor within R.
