Evaluate cluster quality

Description

Evaluates the overall quality of a partition (composed of non-overlapping clusters) according to a user-defined criterion.

Usage

measure(parti, dis, X = NULL, method = "g2", maxmiss = 30, ...)

Arguments

parti

Partition to be evaluated.

dis

A square distance matrix, or an object of class dist, corresponding to X.

X

Data matrix corresponding to parti. Columns are assumed to represent the samples, and rows the samples' features. Missing values are allowed. This argument is optional, but if method is set to 'igp', X must be supplied.

method

Evaluation measure used to assess the quality of the clusters in parti. The default is the Goodman and Kruskal index 'g2'.

maxmiss

Maximum percentage of missing values allowed per row of X.

...

Arguments for function cluster.stats from the fpc package. See details below.
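
As an illustration of how the inputs above fit together, the following base-R sketch prepares a data matrix and a matching distance matrix on made-up data. It is not package code: the toy matrix X, the 30 percent cut-off (mirroring the default of maxmiss), and the correlation-based distance (borrowed from the Examples section) are assumptions for illustration, and how measure itself treats rows exceeding maxmiss is not documented here.

```r
set.seed(1)

# Toy data matrix (hypothetical): 8 features (rows) x 6 samples (columns)
X <- matrix(rnorm(48), nrow = 8, ncol = 6,
            dimnames = list(paste0("f", 1:8), paste0("s", 1:6)))
X[1, 1:3] <- NA   # feature f1 has 50% missing values
X[2, 1]   <- NA   # feature f2 has about 17% missing values

# Percentage of missing values per row, for comparison with maxmiss
pct_miss <- 100 * rowMeans(is.na(X))
keep <- pct_miss <= 30

# Sample-by-sample distance: 1 - Pearson correlation between columns,
# using only the rows that pass the missingness check
dis <- 1 - cor(X[keep, ], use = "pairwise.complete.obs")
dim(dis)   # one row and one column per sample
```

The resulting dis is square with one row per sample, which is the shape the dis argument expects when the columns of X are the samples.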

Details

Numerous criteria for measuring cluster quality have been proposed; this package includes only a few well-known ones. Except for 'c.index' and the in-group proportion 'igp', the criteria come from the function cluster.stats in the fpc package. For these, consult the values returned by cluster.stats before deciding which criterion to use. Note that the values returned by different criteria have different meanings. For example, for the Goodman and Kruskal index 'g2' larger is better, whereas for the G3 index 'g3' smaller is better. Interpret the returned value accordingly.
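
To make the "larger is better" reading of 'g2' concrete, here is a minimal base-R sketch of the Goodman and Kruskal index in its usual clustering form: every within-cluster distance is compared with every between-cluster distance, and the index is (S+ - S-) / (S+ + S-), where S+ counts comparisons in which the within-cluster distance is the smaller one. The helper gk_gamma and the toy data are hypothetical and are not the implementation used by measure or cluster.stats.

```r
# Goodman-Kruskal index for a partition (illustrative helper, not package code)
gk_gamma <- function(dis, parti) {
  d <- as.matrix(dis)
  idx <- which(upper.tri(d), arr.ind = TRUE)   # all unordered pairs of objects
  same <- parti[idx[, 1]] == parti[idx[, 2]]   # TRUE if the pair shares a cluster
  within  <- d[idx][same]
  between <- d[idx][!same]
  cmp <- outer(within, between, "-")           # within minus between, all combinations
  s_plus  <- sum(cmp < 0)                      # concordant: within < between
  s_minus <- sum(cmp > 0)                      # discordant: within > between
  (s_plus - s_minus) / (s_plus + s_minus)
}

# Toy data: two well-separated groups on a line
x   <- c(0, 1, 2, 10, 11, 12)
dis <- dist(x)
good <- c(1, 1, 1, 2, 2, 2)   # matches the two groups
bad  <- c(1, 2, 1, 2, 1, 2)   # mixes the two groups

gk_gamma(dis, good)   # 1: every within-cluster distance is below every between-cluster distance
gk_gamma(dis, bad)    # lower than for the good partition
```

A partition that respects the structure in the distances scores close to 1, while one that cuts across it scores lower, which is why a larger 'g2' indicates a better partition.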

Value

A numeric value representing the quality of the partition under consideration.

Author(s)

Askar Obulkasim

References

Hennig, C. (2010). fpc: Flexible procedures for clustering, R package, http://CRAN.R-project.org/package=fpc.

Kapp, A.V. and Tibshirani, R. (2007). "Are clusters found in one dataset present in another dataset?", Biostatistics, 8, 9-31.

Obulkasim, A. et al. (2013). "Semi-supervised adaptive-height snipping of the Hierarchical Clustering tree", submitted.

See Also

surv_measure

Examples

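library(HCsnip)
data(BullingerLeukemia)
attach(BullingerLeukemia)
# Snip the hierarchical clustering tree built on the first 30 samples
cl <- HCsnipper(em[, 1:30], minclus = 5)
# Keep the rows of the partition matrix selected by cl$id
cl <- cl$partitions[cl$id, ]
# Evaluate each partition (one per row) with the default 'g2' criterion
m <- apply(cl, 1, function(x) measure(parti = x, dis = 1 - cor(em[, 1:30])))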