KMeansPP | R Documentation |
k-means++ clustering (Arthur & Vassilvitskii 2007) improves the speed and
accuracy of standard kmeans clustering (Hartigan & Wong 1979) by preferring
initial cluster centres that are far from others.
A scalable version of the algorithm has been proposed for larger data sets
(Bahmani et al. 2012), but is not implemented here.
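The seeding idea described above can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: `SeedPlusPlus` is a hypothetical helper name, and the sketch assumes the data contain at least `k` distinct points. Each new centre is drawn with probability proportional to its squared distance from the nearest centre already chosen, so distant points are preferred.

```r
# Hypothetical helper (not part of TreeDist): pick k initial centres by
# distance-weighted sampling. Assumes x has at least k distinct rows.
SeedPlusPlus <- function(x, k) {
  x <- as.matrix(x)
  n <- nrow(x)
  centres <- integer(k)
  # First centre: one data point chosen uniformly at random
  centres[1] <- sample.int(n, 1)
  # Squared distance from every point to its nearest chosen centre
  d2 <- rowSums(sweep(x, 2, x[centres[1], ])^2)
  for (i in seq_len(k - 1) + 1) {
    # Points far from all existing centres are more likely to be drawn;
    # already-chosen points have weight zero and cannot be drawn again
    centres[i] <- sample.int(n, 1, prob = d2)
    d2 <- pmin(d2, rowSums(sweep(x, 2, x[centres[i], ])^2))
  }
  x[centres, , drop = FALSE]
}

set.seed(1)
x <- cbind(c(rnorm(10, -5), rnorm(5, 1), rnorm(10, 6)),
           c(rnorm(5, 0), rnorm(15, 4), rnorm(5, 0)))
init <- SeedPlusPlus(x, 5)
# Hand the chosen centres to standard k-means for refinement
fit <- kmeans(x, centers = init)
```

Passing a matrix as `centers` tells `kmeans` to start from those rows rather than from a random selection.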
KMeansPP(x, k = 2, nstart = 10, ...)
x | Numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns). |
k | Integer specifying the number of clusters, k. |
nstart | Positive integer specifying how many random sets of initial centres should be chosen. |
... | Additional arguments passed to |
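As a rough illustration of what multiple random starts buy, the sketch below repeats standard `kmeans` from ten random initialisations and keeps the fit with the lowest total within-cluster sum of squares. This mirrors how `nstart` behaves in `stats::kmeans`; that `KMeansPP` selects its best run by the same criterion is an assumption, and the data and variable names here are made up for the example.

```r
set.seed(1)
# Two illustrative clusters of 20 points each (hypothetical data)
x <- rbind(matrix(rnorm(40, 0), ncol = 2),
           matrix(rnorm(40, 5), ncol = 2))

# Ten independent runs, each from a different random set of centres
fits <- lapply(seq_len(10), function(i) kmeans(x, centers = 2))
wss <- vapply(fits, function(f) f$tot.withinss, numeric(1))

# Keep the run with the smallest total within-cluster sum of squares
best <- fits[[which.min(wss)]]
```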
Martin R. Smith (martin.smith@durham.ac.uk)
kmeans
Other cluster functions:
cluster-statistics
# Generate random points
set.seed(1)
x <- cbind(c(rnorm(10, -5), rnorm(5, 1), rnorm(10, 6)),
c(rnorm(5, 0), rnorm(15, 4), rnorm(5, 0)))
# Conventional k-means may perform poorly
klusters <- kmeans(x, centers = 5)
plot(x, col = klusters$cluster, pch = rep(15:19, each = 5))
# Here, k-means++ recovers a better clustering
plusters <- KMeansPP(x, k = 5)
plot(x, col = plusters$cluster, pch = rep(15:19, each = 5))