mainClustering | R Documentation |
Given input data, this function will try to find the clusters based on the given ClusterFunction object.
## S4 method for signature 'character'
mainClustering(clusterFunction, ...)
## S4 method for signature 'ClusterFunction'
mainClustering(
  clusterFunction,
  inputMatrix,
  inputType,
  clusterArgs = NULL,
  minSize = 1,
  orderBy = c("size", "best"),
  format = c("vector", "list"),
  returnData = FALSE,
  warnings = TRUE,
  ...
)
## S4 method for signature 'ClusterFunction'
getPostProcessingArgs(clusterFunction)
clusterFunction |
a |
... |
arguments passed to the post-processing steps of the clustering.
The available post-processing arguments for a |
inputMatrix |
numerical matrix on which to run the clustering or a
|
inputType |
a character vector defining what type of input is given in
the |
clusterArgs |
arguments to be passed directly to the |
minSize |
the minimum number of samples in a cluster. Clusters smaller than this size are discarded, and the samples in them are given a cluster assignment of "-1" to indicate that they were not clustered. |
orderBy |
how to order the clusters (either by size or by maximum alpha
value). If orderBy="size" the numbering of the clusters is reordered by
the size of the cluster, instead of by the internal ordering of the
|
format |
whether to return a list of the indices in each cluster or a vector of cluster assignments. The list format is mainly for compatibility with the sequential part. |
returnData |
logical as to whether to return the |
warnings |
logical as to whether to give a warning if arguments are given that don't match the clustering choices given. Otherwise, inapplicable arguments will be ignored without warning. |
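The effect of minSize and orderBy="size" on a vector of cluster assignments can be illustrated with a minimal base R sketch. The helper filterAndOrderBySize is hypothetical and is not part of the package; it also assumes that, after reordering, cluster 1 is the largest cluster, which may not match the internal implementation exactly.

```r
# Hypothetical helper (not part of clusterExperiment): applies a minSize
# filter and a size-based renumbering to a vector of cluster assignments.
filterAndOrderBySize <- function(cl, minSize = 1) {
  # Sizes of the clusters, ignoring samples already marked as unclustered
  sizes <- table(cl[cl != -1])
  keep <- names(sizes)[sizes >= minSize]
  # Samples in clusters that are too small become unassigned (-1)
  cl[!(as.character(cl) %in% keep)] <- -1
  # Assumption: renumber the remaining clusters so that 1 is the largest
  ordered <- names(sort(sizes[keep], decreasing = TRUE))
  out <- match(as.character(cl), ordered)
  out[is.na(out)] <- -1
  out
}

cl <- c(3, 3, 3, 3, 1, 1, 2, 2, 2)
# Cluster "1" has only 2 samples, so with minSize=3 they are set to -1
filtered <- filterAndOrderBySize(cl, minSize = 3)
```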
mainClustering is not meant to be called by the user. It is only an exported function so as to be able to clearly document the arguments for mainClustering which can be passed via the argument mainClusterArgs in functions like clusterSingle and clusterMany.
Post-processing Arguments: For post-processing the clustering, currently only type 'K' algorithms have a defined post-processing. Specifically:
"findBestK": logical, whether to find the best K based on average silhouette width (only used if clusterFunction is of type "K").
"kRange": vector of integers to try for k values if findBestK=TRUE. If k is given in clusterArgs, then the default is k-2 to k+20, subject to those values being greater than 2; if not, the default is 2:20. Note that the default values depend on the input k, so running for different choices of k with findBestK=TRUE can give different answers unless kRange is set to be the same.
"removeSil": logical as to whether to remove the assignment of a sample to a cluster when the sample's silhouette value is less than silCutoff.
"silCutoff": cutoff on the minimum silhouette width to be included in a cluster (only used if removeSil=TRUE).
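The post-processing arguments above can be sketched outside of the package. The following is only an illustration under stated assumptions: it calls pam and silhouette from the cluster package directly rather than going through mainClustering, the data are simulated for the example, and kRange is set explicitly to 2:10 rather than relying on a default.

```r
library(cluster)  # provides pam() and silhouette()

set.seed(1)
# Simulated data (an assumption for this sketch): 3 well-separated groups
# of 20 samples each, in 2 dimensions
x <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.3), ncol = 2)
)
d <- dist(x)

# findBestK: choose the k in kRange with the largest average silhouette width
kRange <- 2:10
avgSil <- sapply(kRange, function(k) pam(d, k = k)$silinfo$avg.width)
bestK <- kRange[which.max(avgSil)]
fit <- pam(d, k = bestK)

# removeSil: unassign (-1) samples whose silhouette width is below silCutoff
silCutoff <- 0
sil <- silhouette(fit$clustering, d)
cl <- fit$clustering
cl[sil[, "sil_width"] < silCutoff] <- -1
```

Because the average silhouette width is computed separately for each candidate k, the chosen k depends on which values are in kRange, which is why fixing kRange explicitly makes runs comparable.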
If returnData=FALSE, mainClustering returns a vector of cluster assignments (if format="vector") or a list of indices for each cluster (if format="list"). Clusters smaller than minSize are removed. If returnData=TRUE, then mainClustering returns a list with the following elements:
results: The clusterings of each sample.
inputMatrix: The input matrix given to argument inputMatrix. Useful if the input is the result of subsampling, in which case the input is the set of clusterings found over subsampling.
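The relationship between the two return formats can be sketched in base R. The assignments below are hypothetical, and the sketch assumes that unclustered samples (-1) are simply left out of the list format, which may differ from what the package does internally.

```r
# Hypothetical cluster assignments; -1 marks an unclustered sample
clVector <- c(1, 1, 2, -1, 2, 1)

# List format: one element per cluster, holding the indices of its samples
idx <- seq_along(clVector)
assigned <- clVector != -1
clList <- split(idx[assigned], clVector[assigned])

# Back to vector format: start with everything unclustered, then fill in
back <- rep(-1, length(clVector))
for (nm in names(clList)) {
  back[clList[[nm]]] <- as.numeric(nm)
}
```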
library(clusterExperiment)
data(simData)
# cluster the data matrix directly with pam, using k=3
cl1 <- mainClustering(inputMatrix=simData, inputType="X",
    clusterFunction="pam", clusterArgs=list(k=3))
#supply a dissimilarity, use algorithm type "01"
diss<-as.matrix(dist(t(simData),method="manhattan"))
cl2<-mainClustering(diss, inputType="diss", clusterFunction="hierarchical01",
clusterArgs=list(alpha=.1))
cl3<-mainClustering(inputMatrix=diss, inputType="diss", clusterFunction="pam",
clusterArgs=list(k=3))
# run hierarchical method for finding blocks, with the method of evaluating
# the coherence of a block set to evalClusterMethod="average", and the
# hierarchical clustering using single linkage
# (this clustering function requires input of type 'diss'):
clustSubHier <- mainClustering(diss, inputType="diss",
clusterFunction="hierarchical01", minSize=5,
clusterArgs=list(alpha=0.1,evalClusterMethod="average", method="single"))
#post-process results of pam -- must pass diss for silhouette calculation
clustSubPamK <- mainClustering(simData, inputType="X", clusterFunction="pam",
silCutoff=0, minSize=5, diss=diss, removeSil=TRUE, clusterArgs=list(k=3))
clustSubPamBestK <- mainClustering(simData, inputType="X", clusterFunction="pam", silCutoff=0,
minSize=5, diss=diss, removeSil=TRUE, findBestK=TRUE, kRange=2:10)
# note that passing the wrong arguments for an algorithm results in warnings
# (which can be turned off with warnings=FALSE)
clustSubTight_test <- mainClustering(diss, inputType="diss",
clusterFunction="tight",
clusterArgs=list(alpha=0.1), minSize=5, removeSil=TRUE)
clustSubTight_test2 <- mainClustering(diss, inputType="diss",
clusterFunction="tight",
clusterArgs=list(alpha=0.1,evalClusterMethod="average"))