Robust clustering algorithm based on centers: a robust and efficient version of k-means.
ktaucenters(X, K, centers = NULL, tolmin = 1e-06, NiterMax = 100,
  nstart = 1, startWithKmeans = TRUE, startWithROBINPD = TRUE,
  cutoff = 0.999)
X |
numeric matrix of size n x p. |
K |
the number of clusters. |
centers |
a matrix of size K x p containing the K initial centers,
one at each matrix-row. If centers is NULL a random set of (distinct) rows in |
tolmin |
a tolerance parameter used for the algorithm stopping rule |
NiterMax |
the maximum number of iterations used for the algorithm stopping rule |
nstart |
the number of times the base algorithm ktaucenters_aux is run.
If it is greater than 1 and centers is not NULL, a random set of (distinct) rows
in |
startWithKmeans |
TRUE if the k-means centers are included as a starting point. |
startWithROBINPD |
TRUE if the ROBINDEN estimator is included as a starting point. |
cutoff |
optional argument for outlier detection: the quantile of the chi-square distribution used as the threshold for flagging outliers; defaults to 0.999 |
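To give a feel for the cutoff argument, the sketch below computes the chi-square quantile for p = 2 and flags values that exceed it. Comparing raw squared distances with the quantile is an assumption made purely for illustration; the package may standardize distances by a robust scale before applying the threshold. The names `d` and `flagged` are hypothetical.

```r
# Sketch only: how a chi-square quantile can act as an outlier threshold.
# Assumption: squared distances to the assigned center are compared with the
# quantile directly; ktaucenters may rescale distances first.
p <- 2                                 # number of columns of X
cutoff <- 0.999                        # default value of the cutoff argument
threshold <- qchisq(cutoff, df = p)    # chi-square quantile with p degrees of freedom

# Hypothetical distances of five observations to their cluster centers
d <- c(0.5, 1.2, 2.0, 3.9, 6.0)
flagged <- which(d^2 > threshold)      # indices whose squared distance exceeds it
print(threshold)
print(flagged)
```

A cutoff closer to 1 raises the threshold, so fewer observations are flagged.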
A list including the estimated K centers and labels for the observations
centers
: matrix of size K x p, with the estimated K centers.
cluster
: array of size n x 1 of integer labels between 1 and K.
tauPath
: sequence of tau scale values, one per iteration.
Wni
: numeric array of size n x 1 containing the weight associated with each observation.
emptyClusterFlag
: a boolean value. TRUE means that at least one cluster was completely empty in some iteration.
niter
: number of iterations until convergence is achieved or the maximum number of iterations is reached.
di
: distance of each observation to its assigned cluster-center
outliers
: indices of the observations that can be considered outliers.
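The returned list can be inspected component by component. The following is a minimal sketch, assuming the ktaucenters package is installed and that the installed release matches the signature documented on this page (other releases may use different argument names, in which case the call is skipped with a message). The object name `fit` is hypothetical.

```r
# Sketch only: inspecting the components of the returned list.
set.seed(1)
X <- matrix(rnorm(600) + rep(c(-3, 0, 3), 200), ncol = 2)

if (requireNamespace("ktaucenters", quietly = TRUE)) {
  # tryCatch guards against releases whose arguments differ from this page
  fit <- tryCatch(ktaucenters::ktaucenters(X, K = 3),
                  error = function(e) { message(conditionMessage(e)); NULL })
  if (!is.null(fit)) {
    print(fit$centers)           # K x p matrix of estimated centers
    print(table(fit$cluster))    # number of observations per cluster
    print(fit$niter)             # iterations used
    print(length(fit$outliers))  # number of observations flagged as outliers
  }
}
```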
Gonzalez, J. D., Yohai, V. J., & Zamar, R. H. (2019). Robust Clustering Using Tau-Scales. arXiv preprint arXiv:1906.08198.
library(ktaucenters)
# Generate synthetic data (three well-separated clusters)
Z <- rnorm(600)
mues <- rep(c(-3, 0, 3), 200)
X <- matrix(Z + mues, ncol = 2)
# Generate 60 synthetic outliers (contamination level 20%)
# 60 rows x 2 columns require 120 random values
X[sample(1:300, 60), ] <- matrix(runif(120, 3 * min(X), 3 * max(X)),
                                 ncol = 2, nrow = 60)
### Applying the algorithm ###
sal <- ktaucenters(
  X, K = 3, centers = X[sample(1:300, 3), ],
  tolmin = 1e-3, NiterMax = 100)
### Plotting the clusters ###
par(mfrow = c(1, 2))
plot(X, type = "n", main = "ktaucenters (Robust) \n outliers: solid black dots")
points(X[sal$cluster == 1, ], col = 2)
points(X[sal$cluster == 2, ], col = 3)
points(X[sal$cluster == 3, ], col = 4)
points(X[sal$outliers, 1], X[sal$outliers, 2], pch = 19)
### Applying a classical (non-robust) algorithm ###
sal <- kmeans(X, centers = 3, nstart = 100)
### Plotting the clusters ###
plot(X, type = "n", main = "kmeans (Classical)")
points(X[sal$cluster == 1, ], col = 2)
points(X[sal$cluster == 2, ], col = 3)
points(X[sal$cluster == 3, ], col = 4)
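The plots compare the two fits visually; a numeric companion can be sketched with base R alone. The helper `center_error()` below is hypothetical (it is not part of ktaucenters): for each center used to generate the data it finds the nearest estimated center and reports the worst-case distance. Here it is applied to `kmeans`; the same call should work on the `centers` component of a `ktaucenters` result.

```r
# Sketch only: worst-case distance from the generating centers to the
# nearest estimated center. center_error() is a hypothetical helper.
center_error <- function(est, truth) {
  # For each true center, distance to the closest row of est
  d <- apply(truth, 1, function(t0) min(sqrt(rowSums(sweep(est, 2, t0)^2))))
  max(d)
}

set.seed(123)
mues <- rep(c(-3, 0, 3), 200)
X <- matrix(rnorm(600) + mues, ncol = 2)
# 60 contaminated rows, as in the example on this page
X[sample(1:300, 60), ] <- matrix(runif(120, 3 * min(X), 3 * max(X)), ncol = 2)

truth <- cbind(c(-3, 0, 3), c(-3, 0, 3))   # centers used to generate the data
km <- kmeans(X, centers = 3, nstart = 100)
print(center_error(km$centers, truth))     # smaller is better
```

A value near zero means every generating center was recovered; larger values suggest that at least one estimated center was pulled toward the outliers.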