ktaucenters: ktaucenters


View source: R/ktaucenters.R

Description

Robust clustering algorithm based on centers: a robust and efficient version of k-means.

Usage

ktaucenters(X, K, centers = NULL, tolmin = 1e-06, NiterMax = 100,
  nstart = 1, startWithKmeans = TRUE, startWithROBINPD = TRUE,
  cutoff = 0.999)

Arguments

X

numeric matrix of size n x p.

K

the number of clusters.

centers

a matrix of size K x p containing the K initial centers, one per row. If centers is NULL, a random set of (distinct) rows of X is chosen as the initial centers.

tolmin

a tolerance parameter used in the algorithm's stopping rule.

NiterMax

the maximum number of iterations, used in the algorithm's stopping rule.

nstart

the number of times the base algorithm ktaucenters_aux is run. If it is greater than 1 and centers is not NULL, a random set of (distinct) rows of X will be chosen as the initial centers.

startWithKmeans

TRUE if the k-means centers are included as a starting point.

startWithROBINPD

TRUE if the ROBINDEN estimator is included as a starting point.

cutoff

optional argument for outlier detection: the probability level of the chi-square quantile used as the threshold for flagging outliers. Defaults to 0.999.
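
To make the role of cutoff more concrete, the following base-R sketch turns a probability level into a chi-square threshold and flags points whose squared distance exceeds it. It is an illustration under stated assumptions, not the package's internal rule: the degrees of freedom are assumed to equal p (the number of columns of X), the distances are taken from simulated standard-normal data, and the names threshold, d2 and flagged are hypothetical.

```r
# Sketch only: this page does not spell out the exact rule used inside ktaucenters.
set.seed(1)
p <- 2                                # number of columns of X (assumed degrees of freedom)
cutoff <- 0.999                       # default value of the `cutoff` argument
threshold <- qchisq(cutoff, df = p)   # chi-square quantile at level `cutoff`

# Hypothetical data: squared distances of standard-normal points to the origin
Z <- matrix(rnorm(2000 * p), ncol = p)
d2 <- rowSums(Z^2)

# Points beyond the threshold would be flagged as outliers
flagged <- which(d2 > threshold)
mean(d2 > threshold)                  # expected to be close to 1 - cutoff
```

Under these assumptions, a cutoff closer to 1 raises the threshold and flags fewer points, while a lower cutoff flags more.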

Value

A list that includes the estimated K centers and the cluster labels of the observations.

References

Gonzalez, J. D., Yohai, V. J., & Zamar, R. H. (2019). Robust Clustering Using Tau-Scales. arXiv preprint arXiv:1906.08198.

Examples

library(ktaucenters)

# Generate synthetic data (three well-separated clusters)
Z <- rnorm(600)
mues <- rep(c(-3, 0, 3), 200)
X <- matrix(Z + mues, ncol = 2)

# Replace 60 rows with synthetic outliers (contamination level 20%);
# 120 uniform draws fill the 60 x 2 block without recycling values
X[sample(1:300, 60), ] <- matrix(runif(120, 3 * min(X), 3 * max(X)),
                                 ncol = 2, nrow = 60)

### Applying the algorithm ###
sal <- ktaucenters(
     X, K = 3, centers = X[sample(1:300, 3), ],
     tolmin = 1e-3, NiterMax = 100)

### Plotting the clusters ###
par(mfrow = c(1, 2))
plot(X, type = "n", main = "ktaucenters (Robust) \n outliers: solid black dots")
points(X[sal$cluster == 1, ], col = 2)
points(X[sal$cluster == 2, ], col = 3)
points(X[sal$cluster == 3, ], col = 4)
points(X[sal$outliers, 1], X[sal$outliers, 2], pch = 19)

### Applying a classical (non-robust) algorithm ###
sal <- kmeans(X, centers = 3, nstart = 100)

### Plotting the clusters ###
plot(X, type = "n", main = "kmeans (Classical)")
points(X[sal$cluster == 1, ], col = 2)
points(X[sal$cluster == 2, ], col = 3)
points(X[sal$cluster == 3, ], col = 4)
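
The example above passes three randomly chosen rows of X as the initial centers. A minimal base-R sketch of the same idea for a general K is shown here; the names idx and centers0 are hypothetical, and the data are simulated only so that the snippet runs on its own.

```r
# Sketch: draw K distinct rows of X to use as a K x p matrix of initial centers
set.seed(123)
X <- matrix(rnorm(600), ncol = 2)    # illustrative data, 300 x 2
K <- 3

idx <- sample(nrow(X), K)            # sampling without replacement gives distinct rows
centers0 <- X[idx, , drop = FALSE]   # drop = FALSE keeps a matrix even when K = 1
dim(centers0)
```

A matrix built this way has one center per row, which matches the shape described for the centers argument.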

ktaucenters documentation built on Aug. 3, 2019, 9:03 a.m.