WonM: Weighting on Membership


View source: R/WonM.R

Description

The WonM function performs weighting on membership. First, a distance matrix is computed for each data source and used for hierarchical clustering with the agnes function, using the linkage given in linkage (the default is "flexible"). The user may specify a range of values for the number of clusters at which to cut the resulting dendrograms. For each number of clusters, an incidence matrix is computed, and these matrices are summed for each data source separately. The per-source sums are then added together, resulting in one consensus matrix. Hierarchical clustering is performed on the consensus matrix to obtain the final clustering result.
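The steps above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's implementation: the two binary data sources are simulated, dist(method = "binary") stands in for the Tanimoto distance, and the conversion of the consensus matrix into a dissimilarity (maximum count minus consensus) is an assumption, since the page does not say how WonM does it.

```r
library(cluster)  # provides agnes()

set.seed(1)
n <- 30
# Two hypothetical binary data sources describing the same n objects (rows)
sources <- list(
  matrix(rbinom(n * 20, 1, 0.5), nrow = n),
  matrix(rbinom(n * 15, 1, 0.5), nrow = n)
)

# Incidence matrix: entry (i, j) is 1 when objects i and j share a cluster
incidence <- function(labels) outer(labels, labels, "==") * 1

nrclusters <- 5:10  # numbers of clusters at which each dendrogram is cut

# Per source: distance matrix, agnes clustering, sum of incidence matrices
per_source <- lapply(sources, function(x) {
  d  <- dist(x, method = "binary")  # Jaccard distance on binary rows
  hc <- agnes(d, diss = TRUE, method = "flexible", par.method = 0.625)
  tree <- as.hclust(hc)
  Reduce(`+`, lapply(nrclusters, function(k) incidence(cutree(tree, k = k))))
})

# Consensus matrix: sum of the per-source sums
consensus <- Reduce(`+`, per_source)

# Assumption: turn co-membership counts into a dissimilarity before clustering
max_count <- length(sources) * length(nrclusters)
cons_dist <- as.dist(max_count - consensus)
final <- agnes(cons_dist, diss = TRUE, method = "flexible", par.method = 0.625)
final_labels <- cutree(as.hclust(final), k = 7)
```

Each entry of consensus counts how often two objects were placed in the same cluster over all sources and all cut levels, so objects that are consistently grouped together end up close in the final clustering.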

Usage

WonM(List, type = c("data", "dist", "clusters"),
     distmeasure = c("tanimoto", "tanimoto"), normalize = FALSE, method = NULL,
     nrclusters = seq(5, 25, 1), clust = "agnes",
     linkage = c("flexible", "flexible"), alpha = 0.625, StopRange = FALSE)

Arguments

List

A list of matrices of the same type. It is assumed that the rows correspond to the objects.

type

Indicates whether the matrices provided in "List" are data matrices, distance matrices or clustering results obtained from the data. If type="dist", the calculation of the distance matrices is skipped; if type="clusters", the single-source clustering is skipped. Should be one of "data", "dist" or "clusters".

distmeasure

A vector of the distance measures to be used, one for each data matrix. Each should be one of "tanimoto", "euclidean", "jaccard" or "hamming".

normalize

Logical. Indicates whether to normalize the distance matrices. Normalization is recommended if different distance types are used. See Normalization for details.

method

A method of normalization. Should be one of "Quantile", "Fisher-Yates", "standardize", "Range" or any of the first letters of these names.

nrclusters

A sequence of numbers of clusters at which to cut the dendrograms.

clust

Choice of clustering function (character). Defaults to "agnes".

linkage

A character vector with the choice of inter-group dissimilarity (linkage) for each data set.

alpha

The parameter alpha to be used in the "flexible" linkage of the agnes function. Defaults to 0.625 and is only used if the linkage is set to "flexible".

StopRange

Logical. Controls whether distance matrices with values outside the range zero to one are rescaled to that range. If FALSE, range normalization is performed (see Normalization). If TRUE, the distance matrices are left unchanged. Range normalization is recommended if different types of data are used, so that the distances are comparable.
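To illustrate what distmeasure and StopRange refer to, the sketch below computes a Tanimoto distance on a small binary matrix and rescales a Euclidean distance matrix to the range zero to one. The helper names tanimoto_dist and range_normalize are hypothetical, and the min-max formula is an assumption about what range normalization means here; the package's own code may differ.

```r
# Tanimoto distance between the rows of a binary matrix:
# 1 - (shared "on" bits) / (bits "on" in at least one of the two rows)
tanimoto_dist <- function(x) {
  common <- x %*% t(x)                        # shared "on" bits per row pair
  counts <- rowSums(x)
  union  <- outer(counts, counts, "+") - common
  # Assumption: two all-zero rows are treated as identical
  sim <- ifelse(union == 0, 1, common / union)
  1 - sim
}

# Min-max rescaling of a distance matrix to [0, 1] (assumed form of
# range normalization)
range_normalize <- function(d) (d - min(d)) / (max(d) - min(d))

x <- rbind(c(1, 1, 0, 0),
           c(1, 0, 1, 0),
           c(0, 0, 1, 1))

d_tan <- tanimoto_dist(x)                           # already within [0, 1]
d_euc <- as.matrix(dist(x, method = "euclidean"))   # can exceed 1
d_euc_scaled <- range_normalize(d_euc)
```

Tanimoto distances lie between zero and one by construction, whereas Euclidean distances do not, which is why rescaling matters when sources with different distance types are combined.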

Value

The returned value is a list with four elements:

DistM

A list with the distance matrix for each data structure

ClustSep

The hierarchical clustering result on each data set

Consensus

The computed consensus matrix over all data sources

Clust

The resulting clustering

Note

For now, only hierarchical clustering with the agnes function is implemented.
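As a rough illustration of how agnes from the cluster package handles the "flexible" linkage and its alpha parameter, the sketch below clusters simulated data. With a single value a passed as par.method, agnes uses the Lance-Williams coefficients alpha1 = alpha2 = a and beta = 1 - 2a. The data and object names are made up for this example, and passing alpha through par.method is an assumption about how WonM calls agnes.

```r
library(cluster)  # provides agnes()

set.seed(42)
x <- matrix(rnorm(40 * 5), nrow = 40)  # 40 hypothetical objects, 5 features
d <- dist(x)

# Flexible linkage with alpha = 0.625, so beta = 1 - 2 * 0.625 = -0.25
fit_flex <- agnes(d, diss = TRUE, method = "flexible", par.method = 0.625)

# With alpha = 0.5 (beta = 0) the update rule is that of weighted average linkage
fit_half     <- agnes(d, diss = TRUE, method = "flexible", par.method = 0.5)
fit_weighted <- agnes(d, diss = TRUE, method = "weighted")

# Cut the flexible-linkage dendrogram into 5 clusters
labels5 <- cutree(as.hclust(fit_flex), k = 5)
```

Converting with as.hclust before calling cutree is needed because agnes stores merge heights in dendrogram order rather than merge order.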

Author(s)

Marijke Van Moerbeke

Examples

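library(IntClust)

# Load the two example data sets
data(fingerprintMat)
data(targetMat)
L <- list(fingerprintMat, targetMat)

# Weighting on membership with Tanimoto distances and flexible linkage
MCF7_WonM <- WonM(L, type = "data", distmeasure = c("tanimoto", "tanimoto"),
                  normalize = FALSE, method = NULL, nrclusters = seq(5, 25),
                  clust = "agnes", linkage = c("flexible", "flexible"),
                  alpha = 0.625, StopRange = FALSE)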

IntClust documentation built on May 2, 2019, 5:23 p.m.