HCsnipper: HC tree snipper



Description

This function snips a given hierarchical clustering (HC) tree at variable heights to extract all possible partitions. Each partition (clustering) is composed of non-overlapping clusters.

Usage

HCsnipper(X, hc = NULL, dis = NULL, dis.method = "cor", link.method = "ward", 
          minclus = 4, maxmiss = 30, ...)

Arguments

X

An object of class ExpressionSet or a data matrix from which the HC tree will be derived. Columns are assumed to represent samples, and rows represent the samples' features (genes). Missing values are allowed.

hc

HC tree from which partitions are to be extracted. Must be an object of class hclust. This argument is optional; if given, X and dis are ignored.

dis

A square distance matrix or an object of class dist from which the HC tree is to be derived. This argument is optional; if given, X is ignored.

dis.method

The distance measure to be used. This must be one of the methods accepted by the dist function, or "cor" for Pearson correlation (default).

link.method

The agglomeration method to be used. This should be one of "ward" (default), "single", "complete", "average", "mcquitty", "median" or "centroid".

minclus

The minimum number of samples allowed to form a cluster. The number of partitions returned decreases as this value increases: larger values yield fewer partitions, and vice versa.

maxmiss

Maximum percentage of missing values allowed per row in X.

...

Arguments passed to impute.knn from the impute package, which is used to impute missing values in X.
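
To make the roles of maxmiss, dis.method and link.method more concrete, here is a minimal base-R sketch of the kind of preprocessing these arguments describe. The toy matrix and all variable names are invented for illustration. How HCsnipper itself treats rows that exceed maxmiss is not stated here, so dropping them is an assumption, and pairwise-complete correlation is used only as a stand-in for the impute.knn step.

```r
set.seed(1)
# Toy data: 50 features (rows) x 12 samples (columns); purely illustrative.
X <- matrix(rnorm(50 * 12), nrow = 50,
            dimnames = list(paste0("g", 1:50), paste0("S", 1:12)))
X[1, 1:6] <- NA   # 50% of row 1 is missing
X[2, 1:2] <- NA   # about 16.7% of row 2 is missing

maxmiss <- 30
# Percentage of missing values per row, compared against maxmiss.
pct_missing <- 100 * rowMeans(is.na(X))
too_sparse  <- pct_missing > maxmiss

# Assumption: rows above the threshold are simply dropped in this sketch.
X_kept <- X[!too_sparse, ]

# cor() correlates columns, so this is a sample-by-sample distance.
# Pairwise-complete observations stand in for kNN imputation here.
dis <- as.dist(1 - cor(X_kept, use = "pairwise.complete.obs"))

# "average" is one of the linkage methods listed above.
hc <- hclust(dis, method = "average")
```

Objects like dis or hc built this way are of the classes that the dis and hc arguments expect (dist and hclust respectively).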

Details

For a given HC tree, this function snips it at all possible places to extract partitions under the following conditions:

The last constraint guarantees that snipping does not change the HC tree structure considerably. For example, samples located at the far left of the HC tree will not be joined with samples located at the far right. The number of partitions returned by the function depends not only on the minclus argument, but also on the shape of the HC tree. A balanced HC tree can yield more partitions than a skewed one.
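
The dependence of the partition count on minclus can be illustrated with plain fixed-height cuts. The sketch below uses base R's cutree as a much simpler stand-in; it is not the adaptive-height snipping that HCsnipper performs, and the toy data and variable names are made up.

```r
set.seed(42)
# Toy data: 40 features x 20 samples. Samples 1-10 share one underlying
# signal and samples 11-20 share another, so two loose groups exist.
sig1 <- rnorm(40)
sig2 <- rnorm(40)
X <- cbind(sapply(1:10, function(i) sig1 + rnorm(40, sd = 0.5)),
           sapply(1:10, function(i) sig2 + rnorm(40, sd = 0.5)))
hc <- hclust(as.dist(1 - cor(X)), method = "average")

# One row per cut (k = 2, ..., 10 clusters), one column per sample.
ks <- 2:10
partitions <- t(sapply(ks, function(k) cutree(hc, k = k)))

# Size of the smallest cluster in each partition.
min_size <- apply(partitions, 1, function(p) min(table(p)))

# Number of partitions whose smallest cluster has at least minclus samples.
thresholds <- c(2, 4, 6)
n_ok <- sapply(thresholds, function(minclus) sum(min_size >= minclus))
names(n_ok) <- paste0("minclus=", thresholds)
```

Because every partition that passes a stricter threshold also passes a looser one, n_ok can only stay the same or shrink as minclus grows.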

Value

This function returns a list containing the following objects:

partitions

a matrix in which rows represent partitions and columns represent samples.

id

indices of the partitions in which the minimum cluster size is equal to or larger than minclus.

hc

HC tree from which partitions are extracted.

dat

data matrix. If X has missing values, this is the full data matrix with the missing values imputed.

dis

the distance matrix used

dis.m

the distance measure used

link.m

the agglomeration method used
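
As a rough guide to working with the partitions and id components, the sketch below uses a small hand-built list in place of a real result. It is not actual HCsnipper output, and it assumes that the entries of partitions are integer cluster labels, which is not stated explicitly above.

```r
# Hand-built stand-in for the returned list; NOT real HCsnipper output.
# Three partitions of seven samples; with minclus = 3, rows 1 and 3 qualify.
res <- list(
  partitions = rbind(c(1, 1, 1, 2, 2, 2, 2),
                     c(1, 1, 2, 2, 3, 3, 3),
                     c(1, 1, 1, 1, 2, 2, 2)),
  id = c(1, 3)
)

# drop = FALSE keeps a matrix even when only one partition qualifies.
ok <- res$partitions[res$id, , drop = FALSE]

# Cluster sizes of the first qualifying partition.
sizes <- table(ok[1, ])
```

Indexing with drop = FALSE avoids the case where a single qualifying partition collapses to a vector and a later row subscript fails.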

Author(s)

Askar Obulkasim

References

Obulkasim,A. et al., (2013). "Semi-supervised adaptive-height snipping of the Hierarchical Clustering tree", submitted.

Troyanskaya,O. et al., (2001). "Missing value estimation methods for DNA microarrays". Bioinformatics, 17, 520-525.

Examples

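library(HCsnip)
data(BullingerLeukemia)
attach(BullingerLeukemia)
## HC tree on the first 30 samples, using 1 - Pearson correlation as the distance
H <- hclust(as.dist(1 - cor(em[, 1:30])), method = "ward")
## Snip the tree; cl$id indexes the partitions whose clusters all have >= 5 samples
cl <- HCsnipper(em[, 1:30], minclus = 5)
## Keep the first qualifying partition (drop = FALSE guards against a single match)
cl <- cl$partitions[cl$id, , drop = FALSE][1, ]
## Visualize the partition; this requires the WGCNA package.
#library(WGCNA)
#plotDendroAndColors(H, cl, hang = -1, dendroLabels = FALSE)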

HCsnip documentation built on May 31, 2017, 11:33 a.m.