clusterSingle: General wrapper method to cluster the data


Description

Given input data, a SummarizedExperiment object, or a ClusterExperiment object, this function finds clusters based on a single specification of parameters.

Usage

## S4 method for signature 'missing,matrixOrNULL'
clusterSingle(x, diss, ...)

## S4 method for signature 'matrixOrNULL,missing'
clusterSingle(x, diss, ...)

## S4 method for signature 'SummarizedExperiment,missing'
clusterSingle(x, diss, ...)

## S4 method for signature 'ClusterExperiment,missing'
clusterSingle(x,
  replaceCoClustering = FALSE, ...)

## S4 method for signature 'matrixOrNULL,matrixOrNULL'
clusterSingle(x, diss, subsample = TRUE,
  sequential = FALSE, mainClusterArgs = NULL, subsampleArgs = NULL,
  seqArgs = NULL, isCount = FALSE, transFun = NULL,
  dimReduce = c("none", "PCA", "var", "cv", "mad"), ndims = NA,
  clusterLabel = "clusterSingle", checkDiss = TRUE)

Arguments

x

the data on which to run the clustering (features in rows), or a SummarizedExperiment, or ClusterExperiment object.

diss

n x n data matrix of dissimilarities between the samples on which to run the clustering.

...

arguments to be passed on to the method for signature matrix.

replaceCoClustering

logical. Applicable if x is a ClusterExperiment object. If TRUE, the co-clustering resulting from subsampling is stored in the coClustering slot, replacing any existing coClustering object.

subsample

logical as to whether to subsample via subsampleClustering. If TRUE, clustering in the mainClustering step is done on the co-occurrence between clusterings in the subsampled clustering results. If FALSE, the mainClustering step is run directly on x/diss.

sequential

logical. Whether to use the sequential strategy (see details of seqCluster). Can be used in combination with subsample=TRUE or FALSE.

mainClusterArgs

list of arguments to be passed for the mainClustering step, see help pages of mainClustering.

subsampleArgs

list of arguments to be passed to the subsampling step (if subsample=TRUE), see help pages of subsampleClustering.

seqArgs

list of arguments to be passed to seqCluster.

isCount

logical. Whether the data are in counts, in which case the default transFun argument is set as log2(x+1). This is simply a convenience to the user, and can be overridden by giving an explicit function to transFun.

transFun

function. A function used to transform the input data matrix before clustering.

dimReduce

character. A character string identifying what type of dimensionality reduction to perform before clustering. Options are "none", "PCA", "var", "cv", and "mad". See transform for more details.

ndims

integer. An integer identifying how many dimensions to reduce to in the reduction specified by dimReduce.

clusterLabel

a string used to describe the clustering. By default it is equal to "clusterSingle", to indicate that this clustering is the result of a call to clusterSingle.

checkDiss

logical. Whether to check whether the input diss is valid.
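The shapes expected for x and diss, and the transformation described under isCount, can be illustrated with base R alone. This is only a sketch: the toy count matrix and the choice of Euclidean distance are assumptions made for illustration, not something the package prescribes.

```r
# Toy count matrix: 6 features (rows) x 4 samples (columns); values are made up
counts <- matrix(c(0, 3, 10, 1, 7, 2,
                   1, 4, 12, 0, 9, 3,
                   8, 0,  1, 6, 0, 5,
                   9, 1,  0, 7, 1, 4),
                 nrow = 6, ncol = 4,
                 dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# The transformation that isCount=TRUE is described as selecting by default
logCounts <- log2(counts + 1)

# An n x n dissimilarity matrix between samples (Euclidean is one possible choice)
diss <- as.matrix(dist(t(logCounts)))

dim(diss)           # one row and one column per sample
isSymmetric(diss)   # a dissimilarity matrix should be symmetric
```

A matrix such as logCounts corresponds to the x argument (features in rows), while diss corresponds to the diss argument.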

Details

clusterSingle is an 'expert-oriented' function, intended to be used when a user wants to run a single clustering and/or have a great deal of control over the clustering parameters. Most users will find clusterMany more relevant. However, clusterMany makes certain assumptions about the intention of certain combinations of parameters that might not match the user's intent; similarly clusterMany does not directly take a dissimilarity matrix but only a matrix of values x (though a user can define a distance function to be applied to x in clusterMany).

Unlike clusterMany, most of the relevant arguments for the actual clustering algorithms in clusterSingle are passed to the relevant steps via the arguments mainClusterArgs, subsampleArgs, and seqArgs. These arguments should be named lists with parameters that match the corresponding functions: mainClustering, subsampleClustering, and seqCluster. These functions are not meant to be called by the user, but rather accessed via calls to clusterSingle. However, the user can consult the help files of those functions for more information regarding the parameters that they take.
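As a small sketch of the named-list convention, the list below reuses the values from the Examples section (clusterFunction="pam" with k=3). The variable name mainArgs is chosen here for illustration only, and the call to clusterSingle is left as not run because it requires the package and the simData data set.

```r
# Arguments for the mainClustering step: a named list whose clusterArgs
# element is itself a named list of algorithm-specific parameters
mainArgs <- list(clusterFunction = "pam",
                 clusterArgs = list(k = 3))
str(mainArgs)

## Not run: 
# clustNothing <- clusterSingle(simData, subsample = FALSE, sequential = FALSE,
#                               mainClusterArgs = mainArgs)
## End(Not run)
```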

Only certain combinations of parameters are possible for certain choices of sequential and subsample. These restrictions are documented below.

To supply a distance via the argument distFunction, the function must be defined to compute distances between the rows of a matrix (internally, the function will call distFunction(t(x))). This keeps it compatible with the input expected by the dist function. as.matrix is applied to the output of distFunction, so it suffices that the returned object has an as.matrix method that converts it into a symmetric matrix of distances (for example, objects of class dist returned by dist have such a method). If distFunction=NA, a default distance is calculated based on the type of clustering algorithm given by clusterFunction. For type "K" the default is dist. For type "01" the default is (1-cor(x))/2.
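The row-wise convention for distFunction can be sketched in base R. The function name manhattanDist and the random toy data are hypothetical, chosen only to show the shape of the computation; the second part evaluates the expression given above for type "01" directly, without going through the package.

```r
# Toy data: 5 features (rows) x 4 samples (columns)
set.seed(1)
x <- matrix(rnorm(20), nrow = 5, ncol = 4)

# A user-defined distance (hypothetical name) acting on the ROWS of its input,
# since it would be called as distFunction(t(x))
manhattanDist <- function(m) dist(m, method = "manhattan")

# as.matrix() turns the 'dist' object into a symmetric sample-by-sample matrix
dSamples <- as.matrix(manhattanDist(t(x)))
dim(dSamples)

# The expression stated as the default for type "01", computed directly;
# cor(x) correlates the columns (samples), so the result lies in [0, 1]
d01 <- (1 - cor(x)) / 2
range(d01)
```

Because correlations lie between -1 and 1, (1-cor(x))/2 maps them onto a 0 to 1 scale.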

Value

A ClusterExperiment object if the input x was a matrix (or an assay of a ClusterExperiment or SummarizedExperiment object).

If input was diss, then the result is a list with values

See Also

clusterMany to compare multiple choices of parameters, and mainClustering, subsampleClustering, and seqCluster for the underlying functions called by clusterSingle.

Examples

data(simData)

## Not run: 
#following code takes some time.
#use clusterSingle to do sequential clustering
#(same as example in seqCluster only using clusterSingle ...)
   clusterFunction="hierarchical01",clusterArgs=list(alpha=0.1)))

## End(Not run)

#use clusterSingle to do just clustering k=3 with no subsampling
clustNothing <- clusterSingle(simData,
    subsample=FALSE, sequential=FALSE,
    mainClusterArgs=list(clusterFunction="pam", clusterArgs=list(k=3)))
#compare to standard pam
cluster::pam(t(simData),k=3,cluster.only=TRUE)

clusterExperiment documentation built on Nov. 17, 2017, 8:35 a.m.