ClusterFunction-class: Class ClusterFunction


Description

ClusterFunction is a class for holding functions that the clustering algorithms in this package can use to perform clustering.

The constructor clusterFunction creates an object of the class ClusterFunction.

Usage

internalFunctionCheck(clusterFUN, inputType, algorithmType, outputType)

clusterFunction(clusterFUN, ...)

## S4 method for signature 'function'
clusterFunction(clusterFUN, inputType, outputType,
  algorithmType, inputClassifyType = NA_character_,
  requiredArgs = NA_character_, classifyFUN = NULL, checkFunctions = TRUE)

Arguments

clusterFUN

function passed to slot clusterFUN.

inputType

character for slot inputType

algorithmType

character for slot algorithmType

outputType

character for slot outputType

...

arguments passed to different methods of clusterFunction

inputClassifyType

character for slot inputClassifyType

requiredArgs

character for slot requiredArgs

classifyFUN

function for slot classifyFUN

checkFunctions

logical for whether to check the input functions with internalFunctionCheck

Details

Required arguments for clusterFUN:

algorithmType: Type "01" is for clustering functions that expect as input a dissimilarity matrix that takes on 0-1 values (e.g. from subclustering), with 1 indicating more dissimilarity between samples. "01" algorithm types must also have inputType equal to "diss". It is also generally expected that "01" algorithms use the 0-1 nature of the input to set criteria as to where to find clusters. "01" functions must take as an argument alpha, a value between 0 and 1 that determines the clusters, where larger values of alpha require less similarity between samples in the same cluster.

Type "K" is for clustering functions that require an argument k (the number of clusters) but accept an arbitrary inputType. "K" algorithms are assumed to need a predetermined k and to assign every sample to a cluster. If they do not, the post-processing steps in mainClustering such as findBestK and removeSil may not operate correctly, since they rely on silhouette distances.
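To make the "01" contract concrete, here is a minimal sketch of a function that takes a 0-1 dissimilarity matrix and an alpha threshold. The name hier01Sketch is hypothetical, and the choice of complete-linkage hierarchical clustering cut at height alpha is an assumption made for illustration, not one of the package's built-in algorithms. The argument list mirrors the one used for goodFUN in the Examples section; this sketch has not been run through internalFunctionCheck.

```r
# Hypothetical "01"-style clustering function (illustration only).
# diss : n x n dissimilarity matrix with entries in [0, 1]
# alpha: threshold in [0, 1]; larger values allow less similar samples
#        to end up in the same cluster
# x, checkArgs and cluster.only are accepted but unused in this sketch.
hier01Sketch <- function(x, diss, alpha, checkArgs, cluster.only, ...) {
  stopifnot(is.matrix(diss), nrow(diss) == ncol(diss),
            alpha >= 0, alpha <= 1)
  # Complete linkage: every pair inside a cluster has dissimilarity <= alpha
  tree <- stats::hclust(stats::as.dist(diss), method = "complete")
  cl <- stats::cutree(tree, h = alpha)
  # Report samples left in singleton clusters as unassigned (-1)
  sizes <- table(cl)
  singletons <- as.integer(names(sizes)[sizes == 1])
  cl[cl %in% singletons] <- -1L
  cl
}

# Toy input: two tight groups (samples 1-3 and 4-5) and one outlier (sample 6)
toyDiss <- matrix(0.9, nrow = 6, ncol = 6)
toyDiss[1:3, 1:3] <- 0.1
toyDiss[4:5, 4:5] <- 0.2
diag(toyDiss) <- 0
toyClusters <- hier01Sketch(diss = toyDiss, alpha = 0.3)
```

With alpha = 0.3 the outlier is left unassigned, whereas alpha = 1 would place all six samples in a single cluster, which matches the statement that larger alpha requires less similarity within a cluster.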

internalFunctionCheck is the function that is called by the validity check of the clusterFunction constructor (if checkFunctions=TRUE). It is available as an S3 function so that users can test and debug their functions, which is difficult to do with an S4 validity function.

Value

A ClusterFunction object.

Slots

clusterFUN

a function defining the clustering function. See details for required arguments.

inputType

a character defining what type of input clusterFUN takes. Must be one of "diss", "X", or "either".

algorithmType

a character defining what type of clustering algorithm clusterFUN is. Must be either "01" or "K". clusterFUN must take the corresponding required arguments (see Details).

classifyFUN

a function that takes as input new data and the output of clusterFUN (when cluster.only=FALSE) and returns cluster assignments of the new data. Note that the function should assume that the input 'x' is not the same samples that were input to the clusterFunction (but can assume that it has the same number of features/columns). Used in subsampling clustering. If given value NULL, then subsampling can only be "InSample"; see subsampleClustering.
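As an illustration of what a classification function might look like, the sketch below assigns new samples to the nearest cluster center of a stats::kmeans fit. The name nearestCentroidClassify, the two-argument signature, and the reliance on a centers element in the clustering output are all assumptions of this sketch; the exact arguments the package passes to classifyFUN should be checked against the package itself. The data are assumed to be features x samples, following the t(x) call in goodFUN in the Examples section.

```r
# Hypothetical classifyFUN sketch (illustration only).
# x            : new data, assumed features x samples
# clusterResult: assumed to be a stats::kmeans fit, i.e. to carry a
#                `centers` matrix (clusters x features)
nearestCentroidClassify <- function(x, clusterResult) {
  centers <- clusterResult$centers
  newSamples <- t(x)                       # samples x features
  stopifnot(ncol(newSamples) == ncol(centers))
  # Squared Euclidean distance from every new sample to every center
  d2 <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(newSamples, 2, centers[j, ])^2)
  })
  d2 <- matrix(d2, nrow = nrow(newSamples))  # keep matrix shape for one sample
  # Index of the closest center for each new sample
  max.col(-d2, ties.method = "first")
}

# Toy training data: 2 features x 20 samples, two well-separated groups
set.seed(1)
train <- cbind(matrix(rnorm(20, mean = 0, sd = 0.1), nrow = 2),
               matrix(rnorm(20, mean = 5, sd = 0.1), nrow = 2))
# Start kmeans from one sample of each group so the fit is deterministic
fit <- stats::kmeans(t(train), centers = t(train[, c(1, 11)]))

# New samples that were not part of the training data
newData <- cbind(c(0, 0), c(5, 5), c(0.2, -0.1))
newAssignments <- nearestCentroidClassify(newData, fit)
```

The first and third new samples land in the cluster that contains the first training sample, and the second lands in the other cluster.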

inputClassifyType

the input type for the classification function (if not NULL); like inputType, must be one of "diss", "X", or "either".

outputType

the type of output given by clusterFUN. Must be either "vector" or "list". If "vector", then the output should be a vector of length equal to the number of observations, with integer-valued elements identifying the cluster of each observation; the vector assignments should be in the same order as the original input of the data. Samples that are not assigned to any cluster should be given a '-1' value. If "list", then it must be a list of length equal to the number of clusters, where each element contains the indices of the samples in that cluster. Any indices not in any of the list elements are assumed to be -1. The main advantage of "list" is that it can preserve the order of the clusters if the clusterFUN desires to do so, in which case the orderBy argument of mainClustering can preserve this ordering (default is to order by size).
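The relationship between the two output types can be sketched with a pair of small conversion helpers. The names clusterListToVector and clusterVectorToList are hypothetical and are not functions of this package; they only illustrate the conventions described above (one list element per cluster, -1 for unassigned samples).

```r
# Hypothetical helper: turn "list" output into "vector" output.
# Samples that appear in no list element stay at -1 (unassigned).
clusterListToVector <- function(clusterList, nSamples) {
  assignments <- rep(-1L, nSamples)
  for (i in seq_along(clusterList)) {
    assignments[clusterList[[i]]] <- i
  }
  assignments
}

# Hypothetical helper: turn "vector" output into "list" output,
# dropping unassigned (-1) samples and ordering clusters by id.
clusterVectorToList <- function(assignments) {
  ids <- sort(unique(assignments[assignments != -1]))
  lapply(ids, function(id) which(assignments == id))
}

# The list form lets the first cluster be samples 4-5 and the second 1-3
listOutput <- list(c(4, 5), c(1, 2, 3))
vectorOutput <- clusterListToVector(listOutput, nSamples = 6)
roundTrip <- clusterVectorToList(vectorOutput)
```

Here sample 6 is in neither list element, so it receives -1 in the vector form, and the cluster ids in the vector follow the order of the list elements.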

requiredArgs

Any additional required arguments for clusterFUN (beyond those required of all clusterFUN, described in details).

checkFunctions

logical. If TRUE, the validity check of the ClusterFunction object will check the clusterFUN with simple toy data using the function internalFunctionCheck.

Examples

library(clusterExperiment)

# Use internalFunctionCheck to check a candidate clustering function.
# pam clusters the rows of its input, so x is transposed before the call.
goodFUN <- function(x, diss, k, checkArgs, cluster.only, ...) {
  cluster::pam(x = t(x), k = k, cluster.only = cluster.only)
}
# Passes the internal check
internalFunctionCheck(goodFUN, inputType = "X", algorithmType = "K", outputType = "vector")
# Note it doesn't pass if inputType = "either", because goodFUN has no catch for x = NULL
internalFunctionCheck(goodFUN, inputType = "either", algorithmType = "K", outputType = "vector")
myCF <- clusterFunction(clusterFUN = goodFUN, inputType = "X", algorithmType = "K", outputType = "vector")

# badFUN does not transpose x and does not pass cluster.only on to pam
badFUN <- function(x, diss, k, checkArgs, cluster.only, ...) {
  cluster::pam(x = x, k = k)
}
internalFunctionCheck(badFUN, inputType = "X", algorithmType = "K", outputType = "vector")

Bioconductor-mirror/clusterExperiment documentation built on Aug. 2, 2017, 4:28 p.m.