Find sets of samples that stay together across clusterings
Description
Find sets of samples that stay together across clusterings in order to define a new clustering vector.
Usage
## S4 method for signature 'matrix,missing'
combineMany(x, whichClusters, proportion = 1,
  clusterFunction = "hierarchical01", propUnassigned = 0.5, minSize = 5)
## S4 method for signature 'ClusterExperiment,numeric'
combineMany(x, whichClusters,
  eraseOld = FALSE, clusterLabel = "combineMany", ...)
## S4 method for signature 'ClusterExperiment,character'
combineMany(x, whichClusters, ...)
## S4 method for signature 'ClusterExperiment,missing'
combineMany(x, whichClusters, ...)

Arguments
x 
a matrix or ClusterExperiment object. 
whichClusters 
a numeric or character vector that specifies which clusters to compare (missing if x is a matrix) 
proportion 
The proportion of times that two sets of samples should be together in order to be grouped into a cluster (if <1, passed to clusterD via alpha = 1 - proportion) 
clusterFunction 
the clustering to use (passed to clusterD) 
propUnassigned 
samples with greater than this proportion of assignments equal to '-1' are assigned a '-1' cluster value as a last step (only if proportion < 1) 
minSize 
minimum size required for a set of samples to be considered in a cluster because of shared clustering, passed to clusterD 
eraseOld 
logical. Only relevant if input x is a ClusterExperiment object. 
clusterLabel 
a string used to describe the type of clustering. By default it is equal to "combineMany", to indicate that this clustering is the result of a call to combineMany. However, a more informative label can be set (see vignette). 
... 
arguments to be passed on to the method for signature matrix,missing. 

Details
The function tries to find a consensus clustering across many different clusterings of the same samples. It does so by creating a nSamples x nSamples matrix of the percentage of co-occurrence of each pair of samples and then calling clusterD to cluster the co-occurrence matrix. The function assumes that '-1' labels indicate samples that are not assigned to a cluster. Co-occurrence with the unassigned cluster is treated differently than with other clusters: the percent co-occurrence is taken only with respect to those clusterings where both samples were assigned. Then samples for which more than a proportion propUnassigned of their values across all of the clusterings are '-1' are assigned a '-1' regardless of their cluster assignment.
The method calls clusterD on the proportion matrix with clusterFunction as the 01 clustering algorithm, alpha=1-proportion, minSize=minSize, and evalClusterMethod=c("average"). See help of clusterD for more details.
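The co-occurrence calculation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: sharedProportion is a hypothetical helper and clMat is a made-up toy matrix with samples in rows and clusterings in columns.

```r
# Hypothetical helper (not part of clusterExperiment): for each pair of
# samples, the proportion of clusterings in which they share a cluster,
# counting only clusterings where both samples are assigned (label != -1).
sharedProportion <- function(clMat) {
  n <- nrow(clMat)
  together <- matrix(0, n, n)      # times a pair shares a cluster
  bothAssigned <- matrix(0, n, n)  # times both members of a pair are assigned
  for (j in seq_len(ncol(clMat))) {
    labels <- clMat[, j]
    assigned <- labels != -1
    both <- outer(assigned, assigned, "&")
    same <- outer(labels, labels, "==") & both
    together <- together + same
    bothAssigned <- bothAssigned + both
  }
  together / bothAssigned  # NaN if a pair is never jointly assigned
}

# Toy input: 4 samples (rows) by 3 clusterings (columns); -1 means unassigned.
clMat <- cbind(c(1, 1, 2, 2),
               c(1, 1, 1, -1),
               c(1, 2, 2, -1))
shared <- sharedProportion(clMat)
```

In this toy input, samples 1 and 2 are assigned in all three clusterings and share a cluster in two of them, so their entry is 2/3. Sample 4 is assigned only in the first clustering, so its entries are computed from that clustering alone.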
Value
If x is a matrix, a list with values
clustering 
vector of cluster assignments, with "-1" implying unassigned 
percentageShared 
a nSample x nSample matrix of the percent co-occurrence across clusters used to find the final clusters. Percentage is out of those not '-1' 
noUnassignedCorrection 
a vector of cluster assignments before samples were converted to '-1' because they had >propUnassigned '-1' values (i.e. the direct output of clusterD) 
If x is a ClusterExperiment, a ClusterExperiment object, with an added clustering of clusterTypes equal to combineMany and the percentageShared matrix stored in the coClustering slot.
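For the list returned from a matrix input, the relationship between noUnassignedCorrection and clustering can be sketched as follows. This is an assumption-laden illustration of the propUnassigned step as described in Details, not the package's code: the clMat values and the noUnassignedCorrection vector are invented for the example.

```r
# Toy input: 4 samples (rows) by 3 clusterings (columns); -1 means unassigned.
clMat <- cbind(c(1, 1, 2, 2),
               c(1, 1, 1, -1),
               c(1, 2, 2, -1))

# Invented stand-in for the assignments produced before the correction.
noUnassignedCorrection <- c(1, 1, 2, 2)
propUnassigned <- 0.5

# Proportion of clusterings in which each sample is unassigned.
fracUnassigned <- rowMeans(clMat == -1)

# Samples exceeding propUnassigned are set to -1 as a last step.
clustering <- noUnassignedCorrection
clustering[fracUnassigned > propUnassigned] <- -1
```

Here only sample 4 is unassigned in more than half of the clusterings (2 of 3), so it alone is converted to -1.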
Examples
data(simData)
cl <- clusterMany(simData, nPCADims=c(5,10,50), dimReduce="PCA",
  clusterFunction="pam", ks=2:4, findBestK=c(FALSE), removeSil=TRUE,
  subsample=FALSE)
#make names shorter for plotting
clMat <- clusterMatrix(cl)
colnames(clMat) <- gsub("TRUE", "T", colnames(clMat))
colnames(clMat) <- gsub("FALSE", "F", colnames(clMat))
colnames(clMat) <- gsub("k=NA,", "", colnames(clMat))
#require 100% agreement - very strict
clCommon100 <- combineMany(clMat, proportion=1, minSize=10)
#require 70% agreement based on clustering of overlap
clCommon70 <- combineMany(clMat, proportion=0.7, minSize=10)
#save only the settable graphical parameters so they can be restored cleanly
oldpar <- par(no.readonly=TRUE)
par(mar=c(1.1, 12.1, 1.1, 1.1))
plotClusters(cbind("70%Similarity"=clCommon70$clustering, clMat,
  "100%Similarity"=clCommon100$clustering), axisLine=-2)
#method for ClusterExperiment object
clCommon <- combineMany(cl, whichClusters="workflow", proportion=0.7,
  minSize=10)
plotClusters(clCommon)
par(oldpar)
