05-HCSP: Combining Hierarchical Clustering with SillyPutty

HCSP    R Documentation

Combining Hierarchical Clustering with SillyPutty

Description

Our simulations revealed that the fastest and most accurate clustering algorithm for modest-sized continuous data sets is the combination of hierarchical clustering (with Ward's linkage rule) followed by SillyPutty. The function HCSP implements this combination.

Usage

  HCSP(dis, K, method = "ward.D2", ...)

Arguments

dis

An object of class dist representing a distance matrix.

K

The desired number of clusters.

method

Same as the corresponding argument for hclust. We recommend not changing it.

...

Extra arguments to the SillyPutty function.

Details

The HCSP function first runs hierarchical clustering, then applies the SillyPutty algorithm.
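The first of these two stages can be sketched with base R alone. This is a minimal illustration, not the package's source: the toy data and the names x, centers, start, and mean_sw are hypothetical, and the linkage is simply the default shown under Usage. It builds a dist object, runs hclust with Ward's rule, cuts the dendrogram into K clusters, and reports the mean silhouette width of that starting partition, which is the quantity SillyPutty then tries to improve.

```r
# Stage one only: Ward hierarchical clustering cut into K groups.
set.seed(42)

# Hypothetical toy data: three Gaussian blobs in two dimensions.
centers <- rbind(c(0, 0), c(5, 5), c(0, 5))
x <- centers[rep(1:3, each = 20), ] + matrix(rnorm(120, sd = 0.7), ncol = 2)

dis <- dist(x)                         # object of class "dist", as HCSP expects
K <- 3

hc <- hclust(dis, method = "ward.D2")  # same default linkage as HCSP
start <- cutree(hc, k = K)             # initial labels, one integer per observation

# Mean silhouette width of the starting partition.
sw <- cluster::silhouette(start, dis)
mean_sw <- mean(sw[, "sil_width"])
print(mean_sw)
```

In HCSP, the labels from the cut dendrogram are then handed to SillyPutty together with the distance matrix. That call is left out of the sketch because its exact signature is not given on this page.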

Value

A list containing two items: hc, the results of hierarchical clustering, and sp, a SillyPutty object produced by applying the algorithm to the result of cutting the dendrogram to produce K clusters.

Author(s)

Kevin R. Coombes krc@silicovore.com

References

Polina Bombina, Dwayne Tally, Zachary B. Abrams, Kevin R. Coombes. SillyPutty: Improved clustering by optimizing the silhouette width, bioRxiv 2023.11.07.566055; doi: https://doi.org/10.1101/2023.11.07.566055

Examples

data(eucdist)
set.seed(1234)
twostep <- HCSP(eucdist, K=5)
sw <- cluster::silhouette(twostep$sp@cluster, eucdist)
plot(sw)

SillyPutty documentation built on Feb. 8, 2024, 3 a.m.