cvxclustr: Convex Clustering via Splitting Methods

Description Details Author(s) References

Description

Clustering is a fundamental problem in science and engineering. Many classic methods such as k-means, Gaussian mixture models, and hierarchical clustering, however, employ greedy algorithms which can be entrapped in local minima, sometimes drastical suboptimal ones at that. Recently introduced convex relaxations of k-means and hierarchical clustering shrink cluster centroids toward one another and ensure a unique global minimizer. This package provides two variable splitting methods

for solving this convex formulation of the clustering problem. We seek the centroids u_i that minimize

\frac{1}{2} ∑_i || x_i - u_i||_2^2 + γ ∑_l w_{l} ||u_{l1} - u_{l2} ||

Two penalty norms are currently supported: 1-norm and 2-norm.

Details

The two main functions are cvxclust_path_admm and cvxclust_path_ama which compute the cluster paths using the ADMM and AMA methods respectively. The function cvxclust is a wrapper function that calls either cvxclust_path_admm or cvxclust_path_ama (the default) to perform the computation.

The functions kernel_weights and knn_weights can be used in sequence to compute weights that can improve the quality of the clustering paths.

The typical usage consists of three steps:

Cluster assignments can also be retrieved from the solution to the convex clustering problem. Both cvxclust_path_admm and cvxclust_path_ama output an object of class cvxclustobject. A cluster assignment can be extracted in two steps:

Author(s)

Eric C. Chi, Kenneth Lange

References

Eric C. Chi and Kenneth Lange. Splitting Methods for Convex Clustering. Journal of Computational and Graphical Statistics, in press. http://arxiv.org/abs/1304.0499.


cvxclustr documentation built on May 2, 2019, 3:44 p.m.