# kmeanspp: K-Means++ Clustering Algorithm

In maotai: Tools for Matrix Algebra, Optimization and Inference

## Description

The k-means++ algorithm is a careful seeding technique for k-means: each new center is sampled with probability proportional to its squared distance from the nearest center already chosen. It was originally intended to return a set of k points as initial centers, though it can also serve as a rough clustering algorithm by assigning every observation to its nearest center.
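The seeding-then-assignment idea can be sketched in base R as follows. This is only an illustration under stated assumptions: `kmeanspp_sketch` is a hypothetical helper written for this page, it uses squared Euclidean distance, and it is not the implementation used inside maotai.

```r
## Hypothetical sketch of k-means++ seeding followed by nearest-center assignment.
## Not the maotai implementation; squared Euclidean distance is assumed.
kmeanspp_sketch <- function(data, k = 2) {
  data <- as.matrix(data)
  n <- nrow(data)
  stopifnot(k >= 1, k <= n)

  ## squared Euclidean distance from every row of `data` to row `i`
  sqdist_to <- function(i) rowSums(sweep(data, 2, data[i, ])^2)

  idx <- integer(k)
  idx[1] <- sample.int(n, 1)      # first center: chosen uniformly at random
  d2 <- sqdist_to(idx[1])         # squared distance to the nearest chosen center
  label <- rep(1L, n)             # index of the nearest chosen center

  for (j in seq_len(k)[-1]) {
    ## sample the next center with probability proportional to d2;
    ## fall back to uniform sampling if every point coincides with a center
    prob <- if (sum(d2) > 0) d2 else rep(1, n)
    idx[j] <- sample.int(n, 1, prob = prob)

    ## update nearest-center distance and label where the new center is closer
    dj <- sqdist_to(idx[j])
    closer <- dj < d2
    label[closer] <- j
    d2[closer] <- dj[closer]
  }
  list(centers = idx, label = label)
}

## usage on the iris measurements
set.seed(1)
res <- kmeanspp_sketch(as.matrix(iris[, 1:4]), k = 3)
table(res$label)
```

Here `centers` holds the row indices of the sampled seeds and `label` the nearest-seed assignment, which corresponds to the "rough clustering" described above. No refinement step such as Lloyd's iterations is performed.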

## Usage

```r
kmeanspp(data, k = 2)
```

## Arguments

| Argument | Description |
|---|---|
| `data` | an $(n \times p)$ matrix whose rows are observations. |
| `k` | the number of clusters. |

## Value

a length-n vector of class labels.
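A label vector of this shape can be summarized with base R alone. The sketch below is illustrative: the `labels` vector is a placeholder built from the iris species codes, standing in for the output of `kmeanspp()`, and the names `sizes`, `centers`, and `wss` are introduced here only for the example.

```r
## Placeholder labels: any length-n integer vector works here.
## Assumption: this stands in for the output of kmeanspp(); it is not that output.
data(iris)
X <- as.matrix(iris[, 1:4])
labels <- as.integer(iris$Species)

## number of observations per cluster
sizes <- table(labels)

## per-cluster centroids: column sums within each label, divided by cluster size
centers <- rowsum(X, group = labels) / as.vector(sizes)

## total within-cluster sum of squared distances to the centroids
wss <- sum((X - centers[as.character(labels), , drop = FALSE])^2)

sizes
centers
wss
```

Comparing `wss` across several values of `k` is one common, informal way to judge how many clusters to request.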

## References

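Arthur, D. and Vassilvitskii, S. (2007). "k-means++: the advantages of careful seeding." In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '07), pp. 1027-1035.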

## Examples

```r
library(maotai)

## use simple example of iris dataset
data(iris)
mydata = as.matrix(iris[,1:4])
mycol  = as.factor(iris[,5])

## find the low-dimensional embedding for visualization
my2d = cmds(mydata, ndim=2)$embed

## apply 'kmeanspp' with different numbers of k's.
k2 = kmeanspp(mydata, k=2)
k3 = kmeanspp(mydata, k=3)
k4 = kmeanspp(mydata, k=4)
k5 = kmeanspp(mydata, k=5)
k6 = kmeanspp(mydata, k=6)

## visualize; save the graphics settings first and restore them at the end
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3))
plot(my2d, col=k2, main="k=2", pch=19, cex=0.5)
plot(my2d, col=k3, main="k=3", pch=19, cex=0.5)
plot(my2d, col=k4, main="k=4", pch=19, cex=0.5)
plot(my2d, col=k5, main="k=5", pch=19, cex=0.5)
plot(my2d, col=k6, main="k=6", pch=19, cex=0.5)
plot(my2d, col=mycol, main="true cluster", pch=19, cex=0.5)
par(opar)
```

maotai documentation built on Oct. 25, 2021, 9:06 a.m.