The goal of fusionclust is to conduct clustering and feature screening in large-scale cluster analysis problems. In particular, fusionclust provides the Big Merge Tracker (BMT) and COSCI algorithms for convex clustering and feature screening using an ℓ1 fusion penalty.

BMT is a computationally efficient path algorithm that relies on a convex relaxation of the k-means clustering criterion and is effective at determining the number of clusters (modes) in a univariate problem. COSCI (COnvex Screening for Cluster Information), on the other hand, is a non-parametric method for ranking features and screening out non-informative ones in large-scale cluster analysis problems. It enjoys a perfect screening property: under mild regularity conditions on the densities of the features, COSCI screens out all the non-informative features with high probability.
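To make "convex relaxation of the k-means criterion with an ℓ1 fusion penalty" concrete, a univariate criterion of this kind can be sketched as follows. The symbols $x_i$ (observations), $\alpha_i$ (fitted centroid for observation $i$) and $\lambda$ (tuning parameter) are notation assumed here for illustration, and the scaling of the two terms may differ from the convention used in the referenced papers.

```latex
\min_{\alpha_1,\dots,\alpha_n \in \mathbb{R}} \;
  \frac{1}{2}\sum_{i=1}^{n} (x_i - \alpha_i)^2
  \;+\; \lambda \sum_{1 \le i < j \le n} \lvert \alpha_i - \alpha_j \rvert
```

Observations whose fitted values $\alpha_i$ coincide are read as belonging to the same cluster. At $\lambda = 0$ every observation is its own centroid, and as $\lambda$ grows the penalty fuses centroids together until a single cluster remains, which is the sequence of merges that a path algorithm follows.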

  1. Install the release version of fusionclust from CRAN with install.packages("fusionclust").

  2. Install the development version of fusionclust from GitHub with devtools::install_github("trambakbanerjee/fusionclust").


Check out the included vignette demo-fusionclust for illustrative examples.
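The vignette can be opened from an R session with the standard `vignette()` helper from base R's utils package. The snippet below is a minimal sketch, assuming fusionclust is already installed and that the vignette is registered under the name demo-fusionclust as stated above; the variable name `has_pkg` is only illustrative.

```r
# Check whether fusionclust is available without attaching it
has_pkg <- requireNamespace("fusionclust", quietly = TRUE)

if (has_pkg) {
  # List the vignettes shipped with the package
  print(vignette(package = "fusionclust"))
  # Open the demo vignette (name taken from this README)
  print(vignette("demo-fusionclust", package = "fusionclust"))
} else {
  message("fusionclust is not installed; install it first.")
}
```

If the vignette name differs in your installed version, the listing printed by the first call shows the names that are actually available.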


[1] Feature Screening in Large Scale Cluster Analysis. Banerjee, T., Mukherjee, G. and Radchenko, P. Journal of Multivariate Analysis, Volume 161, 2017, Pages 191-212.

[2] Convex clustering via ℓ1 fusion penalization. Radchenko, P. and Mukherjee, G. J. R. Stat. Soc. Ser. B Stat. Methodol. (2017).

