kgs: KGS Measure for Pruning Hierarchical Clusters

kgs R Documentation

Description

Computes the Kelley-Gardner-Sutcliffe penalty function for a hierarchical cluster tree.

Usage

  kgs (cluster, diss, alpha=1, maxclust=NULL)

Arguments

cluster

object of class hclust or twins.

diss

object of class dissimilarity or dist.

alpha

weight for number of clusters.

maxclust

maximum number of clusters for which to compute measure.

Details

Kelley et al. (see References) proposed a method to help decide where to prune a hierarchical cluster tree. At each level of the tree, the mean within-cluster dissimilarity is computed for every cluster, and these values are then averaged across all clusters. After this average is normalized, the number of clusters times alpha is added. The minimum of the resulting function corresponds to the suggested pruning size.

The current implementation has complexity O(n*n*maxclust), so it is very slow for large n. One possible improvement would be to recompute the spread only for the cluster that is split at each level, rather than recomputing it for every cluster.
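The calculation described in this section can be sketched in base R. The function below is a hypothetical helper (kgs_sketch is not part of maptree) and makes three assumptions that may not match what kgs does: singleton clusters are skipped when averaging, the average spread is rescaled to the range 1 to n - 1 following Kelley et al., and maxclust defaults to n - 1. It recomputes every cluster at every level, so it has the same cost problem noted above.

```r
# Hypothetical helper, not part of maptree: a KGS-style penalty
# computed from an hclust tree and a dist object.
kgs_sketch <- function (tree, d, alpha = 1, maxclust = NULL) {
  dm <- as.matrix (d)
  n <- nrow (dm)
  if (is.null (maxclust)) maxclust <- n - 1   # assumed default
  sizes <- 2:maxclust

  # average spread (mean of within-cluster mean dissimilarities) per tree size
  avsp <- sapply (sizes, function (k) {
    member <- cutree (tree, k = k)
    spread <- sapply (split (seq_len (n), member), function (idx) {
      if (length (idx) < 2) return (NA_real_)   # assumption: skip singletons
      sub <- dm[idx, idx]
      mean (sub[upper.tri (sub)])
    })
    mean (spread, na.rm = TRUE)
  })

  # assumption: rescale the average spread to the range 1 .. n - 1
  rng <- max (avsp) - min (avsp)
  norm <- if (rng > 0) (n - 2) * (avsp - min (avsp)) / rng + 1
          else rep (1, length (avsp))

  # add the weighted number of clusters and name elements by tree size
  penalty <- norm + alpha * sizes
  names (penalty) <- sizes
  penalty
}

d <- dist (USArrests)
tree <- hclust (d, method = "average")
pen <- kgs_sketch (tree, d, maxclust = 10)
```

Here pen is a named vector with one penalty value for each tree size from 2 to 10. A larger alpha puts more weight on the number of clusters and so tends to favour smaller trees.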

Value

Vector of the penalty function for trees of size 2:maxclust. The names of vector elements are the respective numbers of clusters.
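Because the numbers of clusters are stored as element names, they come back as character strings and need converting before use. A minimal sketch of reading off the suggested size, using a made-up vector in the shape described above (the values are illustrative only, not real kgs output):

```r
# Illustrative stand-in for a kgs result: made-up values, not real output
pen <- c ("2" = 9.1, "3" = 6.4, "4" = 5.2, "5" = 5.9, "6" = 7.3)

# the names hold the tree sizes, so convert the winning name to an integer
best <- as.integer (names (pen)[which.min (pen)])
```

The resulting integer can then be passed as k to cutree to obtain cluster memberships at the suggested size.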

Author(s)

Denis White

References

Kelley, L.A., Gardner, S.P., Sutcliffe, M.J. (1996) An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally-related subfamilies, Protein Engineering, 9, 1063-1065.

See Also

twins.object, dissimilarity.object, hclust, dist, clip.clust

Examples

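  library (maptree)   # provides kgs
  library (cluster)   # provides agnes and the votes.repub data
  data (votes.repub)

  # Ward agglomerative clustering, then the KGS penalty for 2 to 20 clusters
  a <- agnes (votes.repub, method="ward")
  b <- kgs (a, a$diss, maxclust=20)

  # the minimum of the plotted curve marks the suggested number of clusters
  plot (names (b), b, xlab="# clusters", ylab="penalty")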

maptree documentation built on April 6, 2022, 5:09 p.m.
