MAX_ITER_DEFAULT: ktaucenters


Description

Robust clustering algorithm based on centers: a robust and efficient version of K-Means.

Usage

1

Arguments

X

numeric matrix of size n x p.

K

the number of clusters.

centers

a matrix of size K x p containing the K initial centers, one per matrix row. If centers is NULL, a random set of (distinct) rows in X is chosen as the initial centers.

tolerance

a tolerance parameter used in the algorithm's stopping rule.

max_iter

the maximum number of iterations, used in the algorithm's stopping rule.

n_runs

the number of times the base algorithm ktaucenters_aux is run. If it is greater than 1 and centers is not NULL, a random set of (distinct) rows in X is chosen as the initial centers.

startWithKmeans

TRUE if the K-Means centers are included as a starting point.

startWithROBINPD

TRUE if the ROBINDEN estimator is included as a starting point.

flag_outliers

optional argument for outlier detection: the chi-square quantile used as the threshold for flagging outliers. Defaults to 0.999.
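The centers and flag_outliers arguments can be illustrated with a short base-R sketch. It is not the package's implementation: it only shows how K distinct rows of X can be drawn as initial centers, and how a chi-square quantile can serve as a cutoff on squared distances. The use of squared Mahalanobis distances with the sample mean and covariance, and all variable names, are assumptions made for illustration.

```r
set.seed(1)
n <- 300; p <- 2; K <- 3
X <- matrix(rnorm(n * p), ncol = p)

# Draw K distinct row indices (sample() draws without replacement by default)
init_rows <- sample(nrow(X), K)
centers <- X[init_rows, , drop = FALSE]

# Assumed outlier rule: flag observations whose squared Mahalanobis distance
# exceeds the chi-square quantile with p degrees of freedom
flag_outliers <- 0.999
threshold <- qchisq(flag_outliers, df = p)
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))
outliers <- which(d2 > threshold)
```

With p = 2 the cutoff is roughly 13.8, so under a Gaussian model only about one observation in a thousand would be flagged by chance.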

Format

An object of class integer of length 1.

Value

A list including the estimated K centers and the cluster labels for the observations.
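To make the roles of tolerance and max_iter and the general shape of such a result concrete, here is a toy center-based loop. It uses plain K-Means (mean) updates rather than the tau-scale updates of ktaucenters. The function name toy_centers and the fields centers, cluster and iter are hypothetical, chosen only for this sketch.

```r
toy_centers <- function(X, centers, tolerance = 1e-3, max_iter = 100) {
  K <- nrow(centers)
  cluster <- integer(nrow(X))
  for (iter in seq_len(max_iter)) {
    # Squared Euclidean distance from every observation to every center (n x K)
    d2 <- sapply(seq_len(K), function(k) rowSums(sweep(X, 2, centers[k, ])^2))
    # Assign each observation to its nearest center
    cluster <- max.col(-d2, ties.method = "first")
    new_centers <- centers
    for (k in seq_len(K)) {
      members <- cluster == k
      # Keep the previous center if a cluster becomes empty
      if (any(members)) new_centers[k, ] <- colMeans(X[members, , drop = FALSE])
    }
    shift <- max(abs(new_centers - centers))
    centers <- new_centers
    # Stopping rule: centers moved less than tolerance (or max_iter reached)
    if (shift < tolerance) break
  }
  list(centers = centers, cluster = cluster, iter = iter)
}

set.seed(1)
X <- rbind(matrix(rnorm(100, mean = -3), ncol = 2),
           matrix(rnorm(100, mean = 3), ncol = 2))
fit <- toy_centers(X, centers = X[c(1, 51), , drop = FALSE])
```

Here fit$centers is a K x p matrix and fit$cluster holds one label per row of X, mirroring the "centers and labels" structure described above.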

References

Gonzalez, J. D., Yohai, V. J., & Zamar, R. H. (2019). Robust Clustering Using Tau-Scales. arXiv preprint arXiv:1906.08198.

Examples

# Generate synthetic data (three well-separated clusters)
Z <- rnorm(600)
mus <- rep(c(-3, 0, 3), 200)
X <- matrix(Z + mus, ncol = 2)

# Replace 60 rows with synthetic outliers (contamination level 20%)
# 60 rows x 2 columns requires 120 random values
X[sample(1:300, 60), ] <- matrix(runif(120, 3 * min(X), 3 * max(X)),
                                 ncol = 2, nrow = 60)

### Applying the algorithm ###
sal <- ktaucenters(
     X, K = 3, centers = X[sample(1:300, 3), ],
     tolerance = 1e-3, max_iter = 100)

### Plotting the clusters ###

# Save the graphical parameters so they can be restored afterwards
oldpar <- par(mfrow = c(1, 2))

plot(X, type = 'n', main = 'ktaucenters (Robust) \n outliers: solid black dots')
points(X[sal$cluster == 1, ], col = 2)
points(X[sal$cluster == 2, ], col = 3)
points(X[sal$cluster == 3, ], col = 4)
points(X[sal$outliers, 1], X[sal$outliers, 2], pch = 19)

### Applying a classical (non-robust) algorithm ###
# stats::kmeans uses nstart (not n_runs) for the number of random starts
sal <- kmeans(X, centers = 3, nstart = 100)

### Plotting the clusters ###
plot(X, type = 'n', main = 'kmeans (Classical)')
points(X[sal$cluster == 1, ], col = 2)
points(X[sal$cluster == 2, ], col = 3)
points(X[sal$cluster == 3, ], col = 4)

par(oldpar)

anevolbap/ktaucenterscpp documentation built on March 10, 2021, 10:12 a.m.