seeding: finding initial cluster centers


Description

A fast seeding algorithm that approximates k-means++ seeding using Markov chain Monte Carlo (MCMC) sampling.
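To make the idea concrete, the following is a minimal plain-R sketch of MCMC-based seeding, assuming a uniform proposal distribution and squared Euclidean distance. The function `mcmc_seeding_sketch` is a hypothetical helper written for illustration; it is not part of ClusterTools, and the package's own implementation of `seeding` may differ in its details. Like the example in this page, it returns row indices into `X`.

```r
# Hypothetical helper, NOT part of ClusterTools: a plain-R sketch of
# MCMC-based seeding (assumption: uniform proposals, squared Euclidean distance).
mcmc_seeding_sketch <- function(X, num_seeds, m = 20) {
  X <- as.matrix(X)
  n <- nrow(X)

  # Squared Euclidean distance from row i to its nearest already-chosen seed.
  dist2_to_seeds <- function(i, seeds) {
    diffs <- X[seeds, , drop = FALSE] -
      matrix(X[i, ], nrow = length(seeds), ncol = ncol(X), byrow = TRUE)
    min(rowSums(diffs^2))
  }

  seeds <- integer(num_seeds)
  seeds[1] <- sample.int(n, 1)            # first seed: uniform at random

  for (k in seq_len(num_seeds)[-1]) {
    chosen <- seeds[seq_len(k - 1)]
    x <- sample.int(n, 1)                 # initial state of the chain
    dx <- dist2_to_seeds(x, chosen)
    for (j in seq_len(m - 1)) {
      y <- sample.int(n, 1)               # uniform proposal
      dy <- dist2_to_seeds(y, chosen)
      # Accept the proposal with probability min(1, dy / dx).
      if (dx == 0 || dy / dx > runif(1)) {
        x <- y
        dx <- dy
      }
    }
    seeds[k] <- x                         # final state becomes the next seed
  }
  seeds
}

set.seed(1)
X_demo <- as.matrix(iris[, 1:4])
idx <- mcmc_seeding_sketch(X_demo, num_seeds = 3, m = 20)
```

Each chain favours points that are far from the seeds chosen so far, which is the property k-means++ obtains by computing distances for every point. The chain only evaluates `m` candidates per seed, so a longer chain gives a closer approximation at a higher cost.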

Usage

seeding(X, num_seeds, m, threads)

Arguments

X

a numeric matrix or data frame

num_seeds

an integer, the number of seeds (initial cluster centers) to select

m

an integer, the length of the Markov chains used to sample centers; 20 by default

threads

an integer, the number of threads used to speed up the computation

Examples

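library(ClusterTools)
data(iris)
# Resample with replacement to build a larger data set (5000 rows)
iris <- iris[sample(1:nrow(iris), 5000, replace = TRUE), ]
X <- as.matrix(iris[, 1:4])
# Select 3 seeds with chain length m = 20 on 2 threads
seeds <- seeding(X, 3, 20, 2)
# Compare k-means started from the seeds with k-means started at random
clus1 <- kmeans(X, X[seeds, ])
table(clus1$cluster, iris[, 5])
clus2 <- kmeans(X, 3)
table(clus2$cluster, iris[, 5])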
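Because the example resamples iris with replacement, many rows of `X` are identical, and `kmeans` stops with an error when the supplied initial centers are not distinct. The sketch below shows one possible guard in base R. The seed indices are hard-coded stand-ins (an assumption for illustration) rather than output from `seeding`, and one duplicate is forced so that the guard has something to do.

```r
# Stand-in data with many duplicated rows, built like the example data.
set.seed(42)
X <- as.matrix(iris[sample(nrow(iris), 5000, replace = TRUE), 1:4])

# Hypothetical seed indices; in practice these would come from seeding().
seeds <- c(1, 2, 3)
X[seeds[2], ] <- X[seeds[1], ]   # force a duplicate to illustrate the problem

centers <- X[seeds, , drop = FALSE]
if (anyDuplicated(centers) > 0) {
  # Drop duplicated centers; kmeans() would otherwise stop with an error.
  centers <- unique(centers)
}
clus <- kmeans(X, centers)
```

Dropping duplicates reduces the number of clusters, so rerunning the seeding step until the centers are distinct may be preferable when a fixed number of clusters is required.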

evanwang1990/ClusterTools documentation built on May 16, 2019, 9:37 a.m.