knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README_figs/" )
This R package implements hierarchical clustering with minimax linkage (a.k.a., prototype clustering). See Bien, J. and Tibshirani, R. (2011) Hierarchical Clustering with Prototypes via Minimax Linkage for more details.
Since protoclust
is on CRAN, it can be easily installed using
install.packages("protoclust")
in R. Occasionally, this github version of the package will be more up-to-date. The easiest way to install this version is
by using the devtools R package (if not already installed, open R and type install.packages("devtools")
). To install protoclust
, type
devtools::install_github("jacobbien/protoclust")
in R.
Suppose we have some data.
set.seed(1) n <- 100 p <- 2 x <- matrix(rnorm(n * p), n, p) rownames(x) <- paste("A", 1:n, sep = "") d <- dist(x) par(pty = "s") plot(x, type = "n", xlab = "", ylab = "") points(x, pch = 20, col = "lightblue")
After computing an n-by-n matrix of dissimilarities d
, we can apply prototype clustering.
# perform minimax linkage clustering: library(protoclust) hc <- protoclust(d) # cut the tree at height 1: h <- 1 cut <- protocut(hc, h = h) # plot dendrogram (and show cut): plot(hc, imerge = cut$imerge, col=2) abline(h = h, lty = 2)
This gives us r length(cut$protos)
prototypes with the guarantee that each of the n points is no more than r h
unit from one of the prototypes.
# get the prototype assigned to each point: pr <- cut$protos[cut$cl] # find point farthest from its prototype: dmat <- as.matrix(d) ifar <- which.max(dmat[cbind(1:n, pr[1:n])]) # since this is a 2d example, make 2d display: par(pty = "s") plot(x, type="n", xlab = "", ylab = "") points(x, pch=20, col="lightblue") lines(rbind(x[ifar, ], x[pr[ifar], ]), col=3) points(x[cut$protos, ], pch=20, col="red") text(x[cut$protos, ], labels=hc$labels[cut$protos], pch=19) tt <- seq(0, 2 * pi, length=100) for (i in cut$protos) { lines(x[i, 1] + h * cos(tt), x[i, 2] + h * sin(tt)) }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.