protoclust | R Documentation |

Performs minimax linkage hierarchical clustering given a set of
dissimilarities. Returns an object that looks just like the output of
`hclust`

except that it has an additional element containing prototype
indices.

protoclust(d, verb = FALSE)

`d` |
dissimilarities object. Can be of class |

`verb` |
see verbose output? |

This function provides an efficient implementation of minimax linkage hierarchical clustering. Consider two clusters G and H and their union U. The minimax linkage between G and H is defined to be the radius of the smallest ball that encloses all of U and that is centered at one of the points in U. If G and H are merged together, the prototype for the newly formed cluster U is that enclosing ball's center. By construction, the prototype for a cluster will always be one of the objects being clustered. For more on minimax linkage and how one can use prototypes to help interpret a dendrogram, see

An object of class `protoclust`

, which is just like
`hclust`

but has an additional element:

```
merge, height,
order
``` |
identical to the values returned by |

`protos` |
a vector of length n - 1. The i-th element is the index of the prototype corresponding to the cluster formed on the i-th merge. |

Jacob Bien and Rob Tibshirani

Bien, J., and Tibshirani, R. (2011), "Hierarchical Clustering
with Prototypes via Minimax Linkage," *The Journal of the American
Statistical Association*, 106(495), 1075-1084.

This function has been designed to work like `hclust`

in terms of
inputs and outputs; however, unlike `hclust`

, it outputs an additional
element, namely a vector of length n - 1 containing the indices of
prototypes. It follows `hclust`

's convention for making the arbitrary
choice of whether to put a subtree on the left or right side.

For cutting a minimax linkage hierarchical clustering, use
`protocut`

, which works like `cutree`

except that it
returns the set of prototypes in addition to the cluster assignments.

This function calls a C implementation of the algorithm detailed in Bien and Tibshirani (2011) that is based on an algorithm described in Murtagh (1983).

Bien, J., and Tibshirani, R. (2011), "Hierarchical Clustering
with Prototypes via Minimax Linkage," *The Journal of the American
Statistical Association*, 106(495), 1075-1084.

Murtagh, F. (1983), "A Survey of Recent Advances in Hierarchical Clustering
Algorithms," *The Computer Journal*, **26**, 354–359.

`protocut`

, `plotwithprototypes`

,
`hclust`

# generate some data: set.seed(1) n <- 100 p <- 2 x <- matrix(rnorm(n * p), n, p) rownames(x) <- paste("A", 1:n, sep="") d <- dist(x) # perform minimax linkage clustering: hc <- protoclust(d) # cut the tree to yield a 10-cluster clustering: k <- 10 # number of clusters cut <- protocut(hc, k=k) h <- hc$height[n - k] # plot dendrogram (and show cut): plotwithprototypes(hc, imerge=cut$imerge, col=2) abline(h=h, lty=2) # get the prototype assigned to each point: pr <- cut$protos[cut$cl] # find point farthest from its prototype: dmat <- as.matrix(d) ifar <- which.max(dmat[cbind(1:n, pr[1:n])]) # note that this distance is exactly h: stopifnot(dmat[ifar, pr[ifar]] == h) # since this is a 2d example, make 2d display: plot(x, type="n") points(x, pch=20, col="lightblue") lines(rbind(x[ifar, ], x[pr[ifar], ]), col=3) points(x[cut$protos, ], pch=20, col="red") text(x[cut$protos, ], labels=hc$labels[cut$protos], pch=19) tt <- seq(0, 2 * pi, length=100) for (i in cut$protos) { lines(x[i, 1] + h * cos(tt), x[i, 2] + h * sin(tt)) }

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.