dpmeans: Dirichlet Process K-Means Clustering
In abnormally-distributed/cvreg: Cross Validation and Robust Estimation Utilities


This function performs k-means-style clustering using the Bayesian Dirichlet process algorithm presented by Kulis & Jordan (2011). Rather than fixing the number of clusters in advance, as in standard k-means, the user specifies a concentration parameter τ (the `tau` argument), which controls the precision of a Dirichlet prior on the number of clusters. Higher values of τ lead to fewer clusters, and lower values lead to more clusters.
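To make the idea concrete, here is a minimal base-R sketch of the hard-assignment DP-means procedure described by Kulis & Jordan. It is not the code used by `dpmeans`: the function name `dpmeans_sketch` and its `lambda` argument are hypothetical, and `lambda` (a squared-distance penalty for opening a new cluster) is only assumed to play a role analogous to `tau`. A point opens a new cluster when its squared Euclidean distance to every existing centroid exceeds `lambda`; otherwise it joins the nearest centroid.

```r
# Minimal DP-means sketch (hypothetical helper, NOT cvreg's implementation).
# lambda: squared-distance penalty for opening a new cluster (assumed analogue of tau).
dpmeans_sketch <- function(x, lambda, max.iter = 100) {
  x <- as.matrix(x)
  n <- nrow(x)
  centers <- matrix(colMeans(x), nrow = 1)  # start with one global cluster
  labels <- rep(1L, n)
  for (iter in seq_len(max.iter)) {
    old <- labels
    for (i in seq_len(n)) {
      # squared Euclidean distance from point i to every current centroid
      d2 <- rowSums(sweep(centers, 2, x[i, ])^2)
      if (min(d2) > lambda) {
        # too far from everything: open a new cluster centred on this point
        centers <- rbind(centers, x[i, ])
        labels[i] <- nrow(centers)
      } else {
        labels[i] <- which.min(d2)
      }
    }
    # drop empty clusters, relabel as 1..k, and recompute centroids
    labels <- as.integer(factor(labels))
    centers <- do.call(rbind, lapply(split(seq_len(n), labels),
      function(idx) colMeans(x[idx, , drop = FALSE])))
    # stop once a full pass leaves every assignment unchanged
    if (identical(labels, old)) break
  }
  list(cluster = labels, centers = centers)
}

# Toy data: two well-separated groups of 10 points each
toy <- rbind(cbind(seq(0, 0.9, by = 0.1), 0),
             cbind(seq(5, 5.9, by = 0.1), 5))
fit_small <- dpmeans_sketch(toy, lambda = 4)    # small penalty
fit_large <- dpmeans_sketch(toy, lambda = 100)  # large penalty
table(fit_small$cluster)
table(fit_large$cluster)
```

In this sketch a larger penalty makes opening a new cluster more expensive, which matches the documented direction of `tau`: higher values give fewer clusters.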

dpmeans(data, tau = 2, prior.labels = NULL, max.iter = 500, tolerance = 1e-06)

`data`	A data frame or matrix of numeric variables.
`tau`	The concentration parameter. Set to higher values to get fewer clusters. Defaults to 2.
`prior.labels`	A vector (character or numeric) or factor of prior cluster labels. This can be created manually or taken from the output of another clustering algorithm. If left as NULL, all observations are initialized in a single cluster.
`max.iter`	The maximum number of iterations. Defaults to 500.
`tolerance`	The tolerance for convergence. Defaults to 1e-06.
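As an illustration of the `prior.labels` argument, the sketch below takes initial labels from `stats::kmeans` and passes them to `dpmeans` using the signature shown above. The data set (`iris`), the two columns chosen, and the number of k-means centers are arbitrary assumptions made for the example. The call is guarded with `requireNamespace()` because cvreg may not be installed, and the structure of the returned object is not documented on this page, so the result is only inspected with `str()`.

```r
# Illustrative data: two standardized numeric columns from a built-in data set
x <- scale(iris[, c("Sepal.Length", "Petal.Length")])

# Prior labels taken from another clustering algorithm (here stats::kmeans)
set.seed(1)
init <- kmeans(x, centers = 3)$cluster

# Guarded call: only runs when the cvreg package is available
if (requireNamespace("cvreg", quietly = TRUE)) {
  fit <- cvreg::dpmeans(x, tau = 2, prior.labels = init,
                        max.iter = 500, tolerance = 1e-06)
  str(fit)  # inspect the result; its structure is not described on this page
}
```

Leaving `prior.labels = NULL` instead starts every observation in a single cluster, as described above.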

Kulis, B.; Jordan, M. (2011) Revisiting k-means: New Algorithms via Bayesian Nonparametrics. Proceedings of the 29th International Conference on Machine Learning.