General wrapper method to cluster the data

Description

Given a data matrix, SummarizedExperiment, or ClusterExperiment object, this function will find clusters, based on a single specification of parameters.

Usage

## S4 method for signature 'matrixOrMissing,matrixOrMissing'
clusterSingle(x, diss,
  subsample = TRUE, sequential = FALSE, clusterFunction = c("tight",
  "hierarchical01", "pam", "hierarchicalK"), clusterDArgs = NULL,
  subsampleArgs = NULL, seqArgs = NULL, isCount = FALSE,
  transFun = NULL, dimReduce = c("none", "PCA", "var", "cv", "mad"),
  ndims = NA, clusterLabel = "clusterSingle")

## S4 method for signature 'SummarizedExperiment,missing'
clusterSingle(x, diss, ...)

## S4 method for signature 'ClusterExperiment,missing'
clusterSingle(x, diss, ...)

Arguments

x

the data on which to run the clustering (features in rows).

diss

an n x n matrix of dissimilarities between the samples on which to run the clustering (used only if subsample=FALSE).

subsample

logical. Whether to subsample via subsampleClustering to obtain the distance matrix at each iteration; if FALSE, the distance function is determined by the distFunction argument passed in clusterDArgs.

sequential

logical. Whether to use the sequential strategy (see the details of seqCluster).

clusterFunction

passed to the 'clusterFunction' option of clusterD to indicate the method of clustering; see clusterD.

clusterDArgs

list of additional arguments to be passed to clusterD.

subsampleArgs

list of arguments to be passed to subsampleClustering.

seqArgs

list of additional arguments to be passed to seqCluster.

isCount

logical. Whether the data are counts, in which case the default transFun argument is set to log2(x+1). This is simply a convenience for the user and can be overridden by giving an explicit function to transFun.

transFun

function. A function used to transform the input data matrix before clustering.

dimReduce

character. A string identifying the type of dimensionality reduction to perform before clustering. Options are "none", "PCA", "var", "cv", and "mad". See transform for more details.

ndims

integer. The number of dimensions to reduce to in the reduction specified by dimReduce.

clusterLabel

a string used to describe the clustering. By default it is equal to "clusterSingle", to indicate that this clustering is the result of a call to clusterSingle.

...

arguments to be passed on to the method for signature matrix.
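
The preprocessing arguments (isCount, transFun, dimReduce, ndims) can be illustrated with base R alone. The sketch below is not the package's implementation; it only shows, under assumptions, what a log2(x+1) transform followed by a "var"-style or "PCA"-style reduction to ndims dimensions typically looks like. The toy matrix counts and the object names are hypothetical.

```r
set.seed(1)

# Hypothetical toy count matrix: 50 features (rows) x 10 samples (columns)
counts <- matrix(rpois(50 * 10, lambda = 20), nrow = 50, ncol = 10)

# The transformation the documentation names as the default when isCount = TRUE
log_counts <- log2(counts + 1)

# Assumed meaning of a "var"-style reduction: keep the ndims most variable features
ndims <- 5
feature_var <- apply(log_counts, 1, var)
keep <- order(feature_var, decreasing = TRUE)[seq_len(ndims)]
reduced_var <- log_counts[keep, , drop = FALSE]

# Assumed meaning of a "PCA"-style reduction: keep the first ndims principal
# components, computed across samples (prcomp expects samples in rows)
pca <- prcomp(t(log_counts), center = TRUE, scale. = FALSE)
reduced_pca <- t(pca$x[, seq_len(ndims), drop = FALSE])

dim(reduced_var)  # ndims rows, one column per sample
dim(reduced_pca)  # ndims rows, one column per sample
```

In both cases the result keeps samples in columns, matching the "features in rows" layout that x is documented to use.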

Details

If sequential=TRUE, the sequential clustering controls the 'k' argument of the underlying clustering. Setting 'k=' in the list given to clusterDArgs or subsampleArgs therefore has no effect and produces a warning to that effect.
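
The behaviour described above can be pictured with a small base R sketch. The helper drop_k_if_sequential is hypothetical and is not part of the package; it only mimics the documented outcome, in which a user-supplied k is ignored with a warning when sequential=TRUE.

```r
# Hypothetical helper (not a package function): mimics the documented rule that
# a user-supplied 'k' is ignored, with a warning, when sequential = TRUE
drop_k_if_sequential <- function(args, sequential) {
  if (sequential && "k" %in% names(args)) {
    warning("'k' is set by the sequential strategy; ignoring user-supplied k")
    args[["k"]] <- NULL
  }
  args
}

# With sequential = TRUE, 'k' is dropped and the other entries are kept
cleaned <- suppressWarnings(
  drop_k_if_sequential(list(k = 3, minSize = 5), sequential = TRUE)
)
names(cleaned)

# With sequential = FALSE, the list is returned unchanged
kept <- drop_k_if_sequential(list(k = 3, minSize = 5), sequential = FALSE)
names(kept)
```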

Value

A ClusterExperiment object.

See Also

clusterMany to compare multiple choices of parameters.

Examples

data(simData)

## Not run: 
# The following code takes some time.
# Use clusterSingle to do sequential clustering
# (same as example in seqCluster only using clusterSingle ...)
set.seed(44261)
clustSeqHier_v2 <- clusterSingle(simData, clusterFunction = "hierarchical01",
  sequential = TRUE, subsample = TRUE,
  subsampleArgs = list(resamp.n = 100, samp.p = 0.7,
    clusterFunction = "kmeans", clusterArgs = list(nstart = 10)),
  seqArgs = list(beta = 0.8, k0 = 5), clusterDArgs = list(minSize = 5))

## End(Not run)

# Use clusterSingle to do a single clustering with k=3 and no subsampling
clustNothing <- clusterSingle(simData, clusterFunction = "pam",
  subsample = FALSE, sequential = FALSE, clusterDArgs = list(k = 3))
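
As a further illustration of the diss argument, the sketch below uses base R to build an n x n dissimilarity matrix between samples from a features-in-rows matrix. The toy matrix toy and the choice of Euclidean distance are assumptions made for illustration. The final call is left commented out because it has not been verified against the package; it only follows the documented signature.

```r
set.seed(2)

# Hypothetical toy data: 20 features (rows) x 8 samples (columns)
toy <- matrix(rnorm(20 * 8), nrow = 20, ncol = 8)

# dist() works on rows, so transpose to get sample-by-sample distances
d <- as.matrix(dist(t(toy), method = "euclidean"))

dim(d)          # one row and one column per sample
isSymmetric(d)  # a dissimilarity matrix should be symmetric

# Assumed usage, following the documented signature (not verified here):
# clustFromDiss <- clusterSingle(diss = d, clusterFunction = "pam",
#   subsample = FALSE, sequential = FALSE, clusterDArgs = list(k = 3))
```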