Robust Clustering algorithm based on centers, a robust and efficient version of K-Means.
X | numeric matrix of size n x p.
K | the number of clusters.
centers | a matrix of size K x p containing the K initial centers, one per row. If centers is NULL a random set of (distinct) rows in
tolerance | a tolerance parameter used for the algorithm stopping rule.
max_iter | the maximum number of iterations, used for the algorithm stopping rule.
n_runs | the number of times the base algorithm ktaucenters_aux is run. If it is greater than 1 and centers is not NULL, a random set of (distinct) rows in
startWithKmeans | TRUE if the k-means centers are included as a starting point.
startWithROBINPD | TRUE if the ROBINDEN estimator is included as a starting point.
flag_outliers | optional argument for outlier detection: the chi-square quantile used as the detection threshold. Defaults to 0.999.
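To make the role of a chi-square quantile such as flag_outliers concrete, the following is a minimal sketch in base R. It does not reproduce the package internals: the coordinate-wise median as center, the unscaled squared distances, and the names center, d2, threshold and flagged are all assumptions made for illustration. It relies only on the fact that squared distances of p-variate standard normal data to their center approximately follow a chi-square distribution with p degrees of freedom.

```r
# Sketch only: a chi-square quantile used as an outlier cutoff.
# This is NOT the ktaucenters implementation; names below are illustrative.
set.seed(1)
p <- 2
n <- 1000
X <- matrix(rnorm(n * p), ncol = p)      # clean standard normal data
X[1:5, ] <- X[1:5, ] + 10                # five planted outliers

center <- apply(X, 2, median)            # robust center (assumed choice)
d2 <- rowSums(sweep(X, 2, center)^2)     # squared distances to the center

flag_outliers <- 0.999                   # quantile level, same as the documented default
threshold <- qchisq(flag_outliers, df = p)
flagged <- which(d2 > threshold)         # indices exceeding the cutoff
```

A higher quantile gives a larger threshold and therefore flags fewer observations. With clean Gaussian data, roughly a fraction 1 - flag_outliers of the points would still be flagged by chance.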
An object of class integer
of length 1.
A list including the estimated K centers and labels for the observations
centers
: matrix of size K x p, with the estimated K centers.
cluster
: array of size n x 1 with integer labels between 1 and K.
tauPath
: sequence of tau scale values, one per iteration.
Wni
: numeric array of size n x 1 indicating the weights
associated to each observation.
emptyClusterFlag
: a boolean value. TRUE means that at least one cluster was
completely empty in some iteration.
niter
: number of iterations until convergence is achieved
or the maximum number of iterations is reached
di
: distance of each observation to its assigned cluster-center
outliers
: indices of the observations that can be considered outliers
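As a sketch of how the components listed above might be used together, the snippet below summarizes a result after setting the flagged outliers aside. The object fit is a hand-built, hypothetical list that only mimics the documented component names (centers, cluster, di, outliers); it is not real output of ktaucenters, and the values in it are made up for illustration.

```r
# Hypothetical result using the documented component names (not real output).
fit <- list(
  centers  = matrix(c(-3, 0, 3, -3, 0, 3), ncol = 2),  # K = 3 centers, p = 2
  cluster  = c(1L, 1L, 2L, 3L, 3L, 2L),                # labels between 1 and K
  di       = c(0.4, 0.7, 0.2, 0.5, 9.1, 0.3),          # distance to assigned center
  outliers = 5L                                        # indices of flagged observations
)

n <- length(fit$cluster)
K <- nrow(fit$centers)
is_outlier <- seq_len(n) %in% fit$outliers

# Cluster sizes and mean distance to the center, excluding flagged outliers
cluster_sizes <- table(factor(fit$cluster[!is_outlier], levels = seq_len(K)))
mean_dist <- tapply(fit$di[!is_outlier], fit$cluster[!is_outlier], mean)
```

Using factor() with explicit levels keeps a cluster in the size table even if all of its members were flagged as outliers.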
Gonzalez, J. D., Yohai, V. J., & Zamar, R. H. (2019). Robust Clustering Using Tau-Scales. arXiv preprint arXiv:1906.08198.
# Generate synthetic data (three well-separated clusters)
Z <- rnorm(600);
mus <- rep(c(-3, 0, 3), 200)
X <- matrix(Z + mus, ncol=2)
# Generate 60 synthetic outliers (contamination level 20%)
# 60 rows x 2 columns need 120 draws; fewer would be silently recycled
X[sample(1:300,60), ] <- matrix(runif(120, 3 * min(X), 3 * max(X)),
ncol = 2, nrow = 60)
### Applying the algorithm ####
sal <- ktaucenters(
X, K=3, centers=X[sample(1:300,3), ],
tolerance=1e-3, max_iter=100)
### plotting the clusters ###
oldpar = par(mfrow = c(1, 2))
plot(X, type = 'n', main = 'ktaucenters (Robust) \n outliers: solid black dots')
points(X[sal$cluster == 1, ], col = 2);
points(X[sal$cluster == 2, ], col = 3);
points(X[sal$cluster == 3, ], col = 4)
points(X[sal$outliers, 1], X[sal$outliers, 2], pch = 19)
### Applying a classical (non-robust) algorithm ###
# stats::kmeans takes nstart (number of random starts), not n_runs
sal <- kmeans(X, centers = 3, nstart = 100)
### plotting the clusters ###
plot(X, type = 'n', main = 'kmeans (Classical)')
points(X[sal$cluster == 1, ], col = 2);
points(X[sal$cluster == 2, ], col = 3);
points(X[sal$cluster == 3, ], col = 4)
par(oldpar)