knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)

clustRviz
aims to enable fast computation and easy visualization of Convex Clustering
solution paths.
You can install clustRviz
from GitHub with:
# install.packages("devtools")
devtools::install_github("DataSlingers/clustRviz")
Note that RcppEigen (which clustRviz uses internally) triggers many compiler warnings (which cannot be suppressed per CRAN policies).
Many of these warnings can be locally suppressed by adding the line CXX11FLAGS+=-Wno-ignored-attributes to your ~/.R/Makevars file.
To install an R package from source, you will need suitable development tools installed, including a C++ compiler and potentially a Fortran runtime. Details about these toolchains are available on CRAN for Windows and macOS.
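The Makevars change described above can also be made from within R. The following is a minimal sketch, not part of clustRviz: `add_makevars_flag()` is a hypothetical helper name, and it assumes the default per-user Makevars location `~/.R/Makevars`. It appends the flag only if an identical line is not already present.

```r
# Hypothetical helper (not provided by clustRviz): append a flag line to a
# Makevars file unless an identical line already exists.
add_makevars_flag <- function(flag = "CXX11FLAGS+=-Wno-ignored-attributes",
                              path = file.path("~", ".R", "Makevars")) {
  path <- path.expand(path)

  # Create the containing directory (e.g. ~/.R) if it does not exist yet.
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)

  # Read any existing lines so they are preserved.
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character(0)

  if (flag %in% existing) {
    return(invisible(FALSE))  # flag already present, file left untouched
  }

  writeLines(c(existing, flag), path)
  invisible(TRUE)             # flag was added
}
```

Calling `add_makevars_flag()` with no arguments edits the per-user file. Passing a different `path` makes it easy to try the helper on a scratch file first.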
There are two main entry points to the clustRviz
package, the CARP
and CBASS
functions, which perform convex clustering and convex biclustering respectively.
We demonstrate the use of these two functions on a text mining data set, presidential_speech, which measures how often the 44 U.S. presidents used certain words in their public addresses.
library(clustRviz)
data(presidential_speech)
presidential_speech[1:6, 1:6]
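Before clustering, it can help to check the shape and labels of the data. The sketch below uses only base R accessors and assumes, as the preview above suggests, that presidential_speech is a matrix-like object with one row per president, one column per word, and row and column names attached.

```r
library(clustRviz)
data(presidential_speech)

# Number of observations (presidents) and features (words).
dim(presidential_speech)

# The first few observation and feature labels.
head(rownames(presidential_speech))
head(colnames(presidential_speech))
```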
We begin by clustering this data set, grouping the rows (presidents) into clusters:
carp_fit <- CARP(presidential_speech)
print(carp_fit)
The algorithmic regularization technique employed by CARP
makes computation of
the whole solution path almost immediate.
We can examine the result of CARP
graphically. We begin with a standard dendrogram,
with three clusters highlighted:
plot(carp_fit, type = "dendrogram", k = 3)
Examining the dendrogram, we see two clear clusters, consisting of pre-WWII and post-WWII presidents, with Warren G. Harding as a possible outlier. Harding is generally considered one of the worst U.S. presidents of all time, so this is perhaps not too surprising.
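The groupings read off the dendrogram can also be inspected as plain cluster labels. The sketch below rests on an assumption: that the installed version of clustRviz exports a `get_cluster_labels()` accessor which accepts a `k` argument and returns one label per row. If your version does not, the package documentation should describe the equivalent.

```r
library(clustRviz)
data(presidential_speech)

carp_fit <- CARP(presidential_speech)

# Assumption: get_cluster_labels() is available in the installed version and
# returns one cluster label per observation for the requested k.
cluster_labels <- get_cluster_labels(carp_fit, k = 3)

# Cluster sizes, then the presidents assigned to each cluster.
table(cluster_labels)
split(rownames(presidential_speech), cluster_labels)
```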
A more interesting visualization is the dynamic path visualization, whereby we can watch the clusters fuse as the regularization level is increased:
plot(carp_fit, type = "path", dynamic = TRUE)
The use of CBASS
for convex biclustering is similar, and we demonstrate it here
with a cluster heatmap, with the regularization set to give 3 observation clusters:
cbass_fit <- CBASS(presidential_speech)
plot(cbass_fit, k.row = 3)
By default, plotting the result of CBASS gives the traditional cluster heatmap, but we can also plot the row or column dendrograms:
plot(cbass_fit, type = "row.dendrogram", k.row = 3)
By default, if a regularization level is specified, all plotting functions in clustRviz
will plot the clustered data. If the regularization level is not specified, the
raw data will be plotted instead:
plot(cbass_fit, type = "heatmap")
More details about the use and mathematical formulation of CARP
and CBASS
may be found in the package documentation.