Clustering is a fundamental problem in science and engineering. Many classic methods, such as k-means, Gaussian mixture models, and hierarchical clustering, employ greedy algorithms that can become trapped in local minima, sometimes drastically suboptimal ones. Recently introduced convex relaxations of k-means and hierarchical clustering shrink cluster centroids toward one another and ensure a unique global minimizer. This package provides two variable splitting methods
Alternating Direction Method of Multipliers (ADMM)
Alternating Minimization Algorithm (AMA)
for solving this convex formulation of the clustering problem. We seek the centroids u_i that minimize
\frac{1}{2} \sum_i \| x_i - u_i \|_2^2 + \gamma \sum_l w_l \| u_{l1} - u_{l2} \|
Two penalty norms are currently supported: 1-norm and 2-norm.
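To make the objective concrete, the following is a minimal Python sketch that evaluates it for a candidate set of centroids. It is an illustration only and not part of the package: the function name convex_clustering_objective, the list-of-lists data layout, and the representation of each pair l as an index tuple (l1, l2) are all assumptions made for this example.

```python
import math


def convex_clustering_objective(X, U, edges, weights, gamma, norm=2):
    """Evaluate 1/2 * sum_i ||x_i - u_i||_2^2 + gamma * sum_l w_l ||u_l1 - u_l2||.

    X, U    : lists of points, each point a list of floats (hypothetical layout)
    edges   : list of index pairs (l1, l2), one per penalty term
    weights : list of nonnegative weights w_l, aligned with edges
    norm    : 1 or 2, the norm used in the penalty term
    """
    # Squared-error fit term between each observation and its centroid.
    fit = 0.5 * sum(
        (x - u) ** 2 for xi, ui in zip(X, U) for x, u in zip(xi, ui)
    )

    # Weighted fusion penalty on pairwise centroid differences.
    penalty = 0.0
    for (i, j), w in zip(edges, weights):
        diff = [a - b for a, b in zip(U[i], U[j])]
        if norm == 1:
            penalty += w * sum(abs(d) for d in diff)
        elif norm == 2:
            penalty += w * math.sqrt(sum(d * d for d in diff))
        else:
            raise ValueError("norm must be 1 or 2")

    return fit + gamma * penalty
```

With gamma equal to zero the minimizer is u_i = x_i, so every point is its own cluster; as gamma grows, the penalty term pulls centroids together until they coincide.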
The two main functions are cvxclust_path_admm and cvxclust_path_ama which compute the cluster paths using
the ADMM and AMA methods respectively. The function cvxclust is a wrapper function that calls either
cvxclust_path_admm or cvxclust_path_ama (the default) to perform the computation.
The functions kernel_weights and knn_weights can be used in sequence
to compute weights that can improve the quality of the clustering paths.
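As a rough illustration of what "in sequence" can mean here, the Python sketch below first computes dense pairwise weights and then zeroes out all but those between near neighbors. It assumes Gaussian kernel weights of the form exp(-phi * ||x_i - x_j||^2) and a k-nearest-neighbor rule that keeps a pair if either point is among the other's k closest; both are assumptions for this example, and the helper names gaussian_kernel_weights and knn_sparsify are hypothetical, not the package's R functions.

```python
import math


def gaussian_kernel_weights(X, phi):
    """Return {(i, j): weight} for all pairs i < j (assumed Gaussian kernel)."""
    n = len(X)
    w = {}
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            w[(i, j)] = math.exp(-phi * d2)
    return w


def knn_sparsify(w, n, k):
    """Zero every weight whose pair is not a k-nearest-neighbor pair.

    A pair (i, j) is kept if j is among the k largest-weight neighbors of i,
    or i is among those of j (an assumed symmetrization rule).
    """
    keep = set()
    for i in range(n):
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: w[(min(i, j), max(i, j))],
            reverse=True,
        )
        for j in neighbors[:k]:
            keep.add((min(i, j), max(i, j)))
    return {pair: (val if pair in keep else 0.0) for pair, val in w.items()}
```

Sparse weights reduce the number of penalty terms, which lowers the computational cost, while still tying each point to its closest neighbors.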
The typical usage consists of three steps:
Compute weights w.
Generate a geometrically increasing regularization parameter sequence. Unfortunately, a closed-form expression for the minimum amount of penalization needed to achieve complete coalescence is currently unknown.
Call cvxclust using the data X, weights w, and regularization parameter sequence gamma.
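The second step can be sketched in a few lines of Python. The function name geometric_gamma_sequence and its parameters (starting value, growth ratio, number of steps) are hypothetical choices for this example. Because the amount of penalization needed for complete coalescence is not known in advance, the starting value and ratio would in practice be chosen by trial, extending the sequence until all centroids merge.

```python
def geometric_gamma_sequence(gamma_start, ratio, n_steps):
    """Return [gamma_start * ratio**k for k = 0, ..., n_steps - 1].

    gamma_start must be positive and ratio must exceed 1 so that the
    sequence is strictly increasing.
    """
    if gamma_start <= 0:
        raise ValueError("gamma_start must be positive")
    if ratio <= 1:
        raise ValueError("ratio must be greater than 1")
    return [gamma_start * ratio ** k for k in range(n_steps)]
```

For instance, geometric_gamma_sequence(0.5, 2.0, 4) doubles the penalty at each step, starting from 0.5.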
Cluster assignments can also be retrieved from the solution to the convex clustering problem.
Both cvxclust_path_admm and cvxclust_path_ama output an object of class cvxclustobject.
A cluster assignment can be extracted in two steps:
Call create_adjacency to construct an adjacency matrix from the centroid differences variable V.
Call find_clusters to extract the connected components of the adjacency matrix.
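The idea behind these two steps can be sketched in Python as follows. This is not the package's implementation: it assumes that two points belong to the same cluster when the centroid difference for their pair is zero up to a tolerance, and the function name clusters_from_differences, the tol parameter, and the data layout are hypothetical. Connected components are found with a small union-find structure rather than an explicit adjacency matrix.

```python
def clusters_from_differences(n, edges, V, tol=1e-8):
    """Assign cluster labels 0, 1, ... to n points.

    edges : list of index pairs (l1, l2)
    V     : list of centroid-difference vectors, aligned with edges
    tol   : a pair is treated as fused when ||v||_2 <= tol (assumption)
    """
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link the endpoints of every pair whose centroid difference vanishes.
    for (i, j), v in zip(edges, V):
        if sum(c * c for c in v) ** 0.5 <= tol:
            parent[find(i)] = find(j)

    # Relabel the component roots as consecutive integers.
    labels, root_to_label = [], {}
    for i in range(n):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels
```

Points that share a label lie in the same connected component, that is, their centroids have coalesced at the given value of the regularization parameter.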
Eric C. Chi, Kenneth Lange
Eric C. Chi and Kenneth Lange. Splitting Methods for Convex Clustering. Journal of Computational and Graphical Statistics, in press. http://arxiv.org/abs/1304.0499.