Ckmeans.1d.dp: Optimal, Fast, and Reproducible Univariate Clustering

Fast, optimal, and reproducible weighted univariate clustering by dynamic programming. Four problems are solved: univariate k-means (Wang & Song 2011 <doi:10.32614/RJ-2011-015>; Song & Zhong 2020 <doi:10.1093/bioinformatics/btaa613>), k-median, k-segments, and multi-channel weighted k-means. Each is solved by dynamic programming, which minimizes the sum of (weighted) within-cluster distances under the metric of the respective problem. The advantage of this approach over heuristic clustering, in both efficiency and accuracy, is most pronounced when there are many clusters. Multi-channel weighted k-means groups multiple univariate signals into k clusters. An auxiliary function generates histograms that adapt to patterns in the data. Together, these tools support univariate data analysis with guaranteed optimality, efficiency, and reproducibility, and are useful for peak calling on temporal, spatial, and spectral data.
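
The dynamic programming idea described above can be illustrated with a minimal base-R sketch for the unweighted k-means case. This is not the package's implementation: the function name `kmeans_1d_dp` is hypothetical, and the sketch uses a plain O(k n^2) recurrence, whereas the package is described as fast. It relies only on the fact that, for sorted univariate data, each optimal cluster is a contiguous run of points.

```r
# Hypothetical helper (not part of Ckmeans.1d.dp): optimal univariate
# k-means by a simple O(k * n^2) dynamic program over sorted data.
kmeans_1d_dp <- function(x, k) {
  stopifnot(is.numeric(x), k >= 1, k <= length(x))
  ord <- order(x)
  xs <- x[ord]
  n <- length(xs)
  s1 <- c(0, cumsum(xs))     # prefix sums of values
  s2 <- c(0, cumsum(xs^2))   # prefix sums of squared values

  # Sum of squared deviations from the mean for sorted points i..j
  sse <- function(i, j) {
    s <- s1[j + 1] - s1[i]
    max(0, (s2[j + 1] - s2[i]) - s^2 / (j - i + 1))
  }

  D <- matrix(Inf, nrow = k, ncol = n)  # D[m, j]: best cost, first j points, m clusters
  B <- matrix(1L, nrow = k, ncol = n)   # B[m, j]: start index of the last cluster
  for (j in seq_len(n)) D[1, j] <- sse(1, j)
  if (k >= 2) {
    for (m in 2:k) {
      for (j in m:n) {
        for (i in m:j) {
          cost <- D[m - 1, i - 1] + sse(i, j)
          if (cost < D[m, j]) {
            D[m, j] <- cost
            B[m, j] <- i
          }
        }
      }
    }
  }

  # Backtrack from the last cluster to the first to recover assignments
  cl_sorted <- integer(n)
  j <- n
  for (m in k:1) {
    i <- B[m, j]
    cl_sorted[i:j] <- m
    j <- i - 1L
  }
  cluster <- integer(n)
  cluster[ord] <- cl_sorted   # map labels back to the original order
  list(cluster = cluster,
       centers = as.numeric(tapply(xs, cl_sorted, mean)),
       tot.withinss = D[k, n])
}

# Example call on a small, made-up vector
res <- kmeans_1d_dp(c(1.0, 1.2, 0.9, 5.0, 5.1, 9.8, 10.1), k = 3)
```

Because the recurrence examines every contiguous split, the result is the global minimum of the within-cluster sum of squares and does not depend on random initialization, which is the sense in which such a method is optimal and reproducible.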

Package details
Author	Joe Song [aut, cre] (<https://orcid.org/0000-0002-6883-6547>), Hua Zhong [aut] (<https://orcid.org/0000-0003-1962-2603>), Haizhou Wang [aut]
Maintainer	Joe Song <joemsong@cs.nmsu.edu>
License	LGPL (>= 3)
Version	4.3.5
Package repository	CRAN
Installation	Install the latest version of this package by entering the following in R: `install.packages("Ckmeans.1d.dp")`
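
Once installed, a first call might look like the sketch below. It assumes the package's main function `Ckmeans.1d.dp(x, k)` and the `centers` and `cluster` components of its result; the simulated data and the choice of three clusters are purely illustrative. The call is guarded so that the snippet still runs where the package is not installed.

```r
# Illustrative data: three well-separated groups (made up for this example)
set.seed(1)
x <- c(rnorm(20, mean = -3), rnorm(20, mean = 0), rnorm(20, mean = 4))

# Run the clustering only if the package is available
if (requireNamespace("Ckmeans.1d.dp", quietly = TRUE)) {
  res <- Ckmeans.1d.dp::Ckmeans.1d.dp(x, k = 3)
  print(res$centers)         # cluster centers
  print(table(res$cluster))  # number of points per cluster
}
```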