kgs | R Documentation |
Computes the Kelley-Gardner-Sutcliffe penalty function for a hierarchical cluster tree.
kgs (cluster, diss, alpha=1, maxclust=NULL)
cluster |
object of class |
diss |
object of class |
alpha |
weight for number of clusters. |
maxclust |
maximum number of clusters for which to compute measure. |
Kelley et al. (see reference) proposed a method that can help decide where to prune a hierarchical cluster tree. At each level of the tree, the dissimilarity is averaged within each cluster, and these within-cluster means are then averaged across all clusters. After normalizing, the number of clusters times alpha is added. The minimum of this function corresponds to the suggested pruning size.
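The calculation described above can be sketched in base R. This is only an illustration and not the package's implementation: the name kgs_sketch is hypothetical, the tree is assumed to be an hclust object, singleton clusters are assumed to be left out of the average, and the normalization to the range 1 to n-1 is an assumption based on a reading of Kelley et al.

```r
# Hypothetical sketch of a Kelley-Gardner-Sutcliffe style penalty.
# Assumes `tree` is an hclust object and `d` is the matching dist object.
kgs_sketch <- function(tree, d, alpha = 1, maxclust = NULL) {
  dm <- as.matrix(d)
  n <- nrow(dm)
  if (is.null(maxclust)) maxclust <- n - 1
  stopifnot(maxclust >= 3, maxclust <= n - 1)
  sizes <- 2:maxclust

  # Mean across clusters of the mean within-cluster dissimilarity.
  avg_spread <- sapply(sizes, function(k) {
    members <- cutree(tree, k = k)
    spreads <- sapply(split(seq_len(n), members), function(idx) {
      if (length(idx) < 2) return(NA_real_)  # assumption: skip singletons
      sub <- dm[idx, idx]
      mean(sub[upper.tri(sub)])
    })
    mean(spreads, na.rm = TRUE)
  })

  # Assumed normalization: rescale the average spread to lie in [1, n - 1].
  rng <- range(avg_spread)
  normalized <- (n - 2) * (avg_spread - rng[1]) / (rng[2] - rng[1]) + 1

  # Add the number of clusters weighted by alpha.
  penalty <- normalized + alpha * sizes
  names(penalty) <- sizes
  penalty
}

# Usage on a built-in data set.
d <- dist(scale(USArrests))
tree <- hclust(d, method = "average")
pen <- kgs_sketch(tree, d, maxclust = 10)
```

Because every tree size recomputes the spread of every cluster, this sketch is no faster than the approach it illustrates.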
The current implementation has complexity O(n*n*maxclust) and is therefore very slow for large n. One improvement would be to calculate the spread only for the clusters that are split at each level, rather than recomputing it for every cluster.
Vector of the penalty function for trees of size 2:maxclust. The names of vector elements are the respective numbers of clusters.
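As a small illustration of how the returned vector might be used, the suggested number of clusters can be read from the name of the smallest element. The values below are made up for the example and are not output from kgs.

```r
# Illustrative values only, shaped like the result of kgs(): a numeric
# vector whose names are the numbers of clusters.
pen <- c("2" = 5.1, "3" = 4.2, "4" = 4.8, "5" = 5.9)

# The name of the minimum element gives the suggested pruning size.
best_k <- as.integer(names(pen)[which.min(pen)])
```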
Denis White
Kelley, L.A., Gardner, S.P., Sutcliffe, M.J. (1996) An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally-related subfamilies, Protein Engineering, 9, 1063-1065.
twins.object, dissimilarity.object, hclust, dist, clip.clust
library (cluster)
data (votes.repub)
a <- agnes (votes.repub, method="ward")
b <- kgs (a, a$diss, maxclust=20)
plot (names (b), b, xlab="# clusters", ylab="penalty")