COMMUNAL-package: COmbined Mapping of Multiple clUsteriNg ALgorithms

Description Details Author(s) Examples

Description

This package allows for identification of optimal clustering for a data set. It provides a framework to run a wide range of clustering algorithms to determine the optimal number (k) of clusters in the data. It then provides a function to analyze the cluster assignments from each clustering algorithm to identify samples that repeatedly classify to the same group. We call these 'core clusters,' leading to optimal beds for later class discovery.

Details

Package: COMMUNAL
Type: Package
Version: 1.1
Date: 2015-08-12
License: GPL-2
Imports: clValid, fpc, methods
Depends: R (>= 2.10), cluster
Suggests: RUnit, NMF, ConsensusClusterPlus, rgl

Start with a matrix of data to cluster. Important functions are:
COMMUNAL to run clustering algorithms once
clusterRange to run COMMUNAL across increasing subsets of data
getGoodAlgs to identify robust algorithms from clusterRange
getNonCorrNonMonoMeasures to identify non-monotonic, non-correlated validity measures from clusterRange
plotRange3D to pick k
clusterKeys to identify core clusters
returnCore to identify core clusters

Author(s)

Albert Chen, Timothy E Sweeney, Olivier Gevaert
Maintainer: Albert Chen acc2015@stanford.edu

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
## Not run: 
## create artificial data set with 3 distinct clusters
set.seed(1)
V1 = c(abs(rnorm(100, 2)), abs(rnorm(100, 50)), abs(rnorm(100, 140)))
V2 = c(abs(rnorm(100, 2, 8)), abs(rnorm(100, 55, 4)), abs(rnorm(100, 105, 1)))
data <- t(data.frame(V1, V2))
colnames(data) <- paste("Sample", 1:ncol(data), sep="")
rownames(data) <- paste("Gene", 1:nrow(data), sep="")

## run COMMUNAL
result <- COMMUNAL(data=data, ks=seq(2,5))  # result is a COMMUNAL object
k <- 3                                # suppose optimal cluster number is 3
clusters <- result$getClustering(k)   # method to extract clusters
mat.key <- clusterKeys(clusters)      # get core clusters
examineCounts(mat.key)                # help decide agreement.thresh
core <- returnCore(mat.key, agreement.thresh=50) # find 'core' clusters (all algs agree)
table(core) # the 'core' clusters

## Additional arguments are passed down to clValid, NMF, ConsensusClusterPlus
result <- COMMUNAL(data=data, ks=2:5,
                      clus.methods=c("diana", "ccp-hc", "nmf"), reps=20, nruns=2)

## To identify k, use clusterRange and plotRange3D to visualize validation measures
data(BRCA.100) # 533 tissues to cluster, with measurements of 100 genes each
varRange <- c(10,25,50,75,100)
meas <- c("Connectivity", "average.between",
          "ch", "sindex", "avg.silwidth", 
          "average.within", "dunn", "widestgap",
          "wb.ratio", "entropy", "dunn2", 
          "pearsongamma", "g3", "within.cluster.ss", 
          "min.separation", "max.diameter")
BRCA.results <- clusterRange(BRCA.100, ks=2:6, varRange=varRange, validation=meas)

goodMeasures <- getNonCorrNonMonoMeasures(BRCA.results)
goodAlgs <- getGoodAlgs(BRCA.results)

plot.data <- plotRange3D(BRCA.results, goodAlgs=goodAlgs, goodMeasures = goodMeasures)

## End(Not run)

COMMUNAL documentation built on May 29, 2017, 6:36 p.m.