M3C: Monte Carlo Reference-based Consensus Clustering

Genome-wide data is used to stratify large, complex datasets into classes using class discovery algorithms. Consensus clustering is a widely applied technique; however, the approach is prone to overfitting and false positives. These issues arise from not considering reference distributions when selecting the number of classes (K). As a solution, we developed Monte Carlo reference-based consensus clustering (M3C). M3C uses a multi-core enabled Monte Carlo simulation to generate null distributions along the range of K, which are then used to select its value. Using a reference that maintains the correlation structure of the input features eliminates these limitations of consensus clustering. M3C uses the Relative Cluster Stability Index (RCSI) and p values to decide on the value of K and to reject the null hypothesis, K=1. M3C can also quantify structural relationships between clusters, and uses spectral clustering to deal with non-Gaussian and complex structures. M3C can automatically analyse biological or clinical data with respect to the discovered classes.
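
The idea described above can be illustrated with a minimal base R sketch. It is not the M3C package's code, and every function name in it (pac_score, null_dataset) is hypothetical. It assumes k-means as the inner clustering algorithm for brevity, a stability score defined as the proportion of consensus values between 0.1 and 0.9, a PCA-based simulation as one way to keep the feature correlation structure in the null data, and an RCSI computed as the log10 ratio of the mean reference score to the real score. The small constant guarding against division by zero is also an assumption of this sketch.

```r
# Simplified illustration only; NOT the M3C package's implementation.
set.seed(1)

# Stability score for one dataset at a given K: fraction of sample pairs whose
# consensus value is ambiguous (between 0.1 and 0.9). Lower means more stable.
# x has features in rows and samples in columns.
pac_score <- function(x, k, reps = 25, frac = 0.8) {
  n <- ncol(x)
  together <- matrix(0, n, n)  # times each pair landed in the same cluster
  sampled  <- matrix(0, n, n)  # times each pair was drawn in the same subsample
  for (r in seq_len(reps)) {
    idx <- sample(n, floor(frac * n))
    cl  <- kmeans(t(x[, idx, drop = FALSE]), centers = k, nstart = 5)$cluster
    sampled[idx, idx]  <- sampled[idx, idx] + 1
    together[idx, idx] <- together[idx, idx] + outer(cl, cl, "==")
  }
  cons <- together / pmax(sampled, 1)  # consensus matrix
  v <- cons[upper.tri(cons)]
  mean(v > 0.1 & v < 0.9)
}

# One null dataset with no cluster structure but the same feature covariance:
# draw Gaussian principal component scores with the observed standard
# deviations and project them back through the loadings.
null_dataset <- function(x) {
  pca <- prcomp(t(x), center = TRUE, scale. = FALSE)  # samples as rows
  scores <- sapply(pca$sdev, function(s) rnorm(ncol(x), mean = 0, sd = s))
  t(scores %*% t(pca$rotation))  # back to features x samples
}

# Toy input (made-up data): 20 features x 40 samples, two shifted groups.
x <- matrix(rnorm(20 * 40), nrow = 20)
x[, 1:20] <- x[, 1:20] + 3

k     <- 2
iters <- 20
real_pac <- pac_score(x, k)
ref_pac  <- replicate(iters, pac_score(null_dataset(x), k))

eps  <- 1e-6  # guard against a zero score (assumption of this sketch)
rcsi <- log10((mean(ref_pac) + eps) / (real_pac + eps))

# Empirical Monte Carlo p value: how often the null is at least as stable.
p_value <- (1 + sum(ref_pac <= real_pac)) / (iters + 1)

cat("RCSI:", rcsi, " p value:", p_value, "\n")
```

Repeating the last steps for each K in the range of interest gives one RCSI and one p value per K. A larger RCSI together with a small p value indicates that the real data are more stable at that K than the reference.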

Package details

Author: Christopher John [aut, cre]
Bioconductor views: Clustering, GeneExpression, RNASeq, Sequencing, Transcription
Maintainer: Christopher John <[email protected]>
License: AGPL-3
Version: 1.4.0
Package repository: Bioconductor
Installation: Install the latest version of this package by entering the following in R:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("M3C")

M3C documentation built on Nov. 1, 2018, 3:52 a.m.