Estimates a finite Gaussian mixture model optimized over a snipping set.
X | Data. |
k | Number of clusters. |
V | Binary matrix of the same size as |
R | Initial guess for cluster labels, |
restr.fact | Restriction factor, i.e., constraint on the condition number of all covariance matrices for each cluster. Default is 12. |
tol | Tolerance for convergence. Default is |
maxiters | Maximum number of iterations for the SM algorithm. Default is |
maxiters.S | Maximum number of iterations of the inner greedy snipping algorithm. Default is |
print.it | Logical; if TRUE, partial results are printed. Default is |
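As a rough illustration of what the restr.fact argument constrains, the sketch below computes the condition number (largest eigenvalue divided by smallest) of two example covariance matrices and compares it with the documented default of 12. The helper cond_number and the matrices S1 and S2 are hypothetical and not part of snipEM; how sclust enforces the constraint internally is not described here, so this only checks the ratio.

```r
# Hypothetical helper (not part of snipEM): condition number of a
# covariance matrix, taken as largest eigenvalue / smallest eigenvalue.
cond_number <- function(S) {
  ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
  max(ev) / min(ev)
}

# Two made-up cluster covariance matrices for illustration
S1 <- diag(c(1, 2, 4))
S2 <- diag(c(1, 5, 20))

restr.fact <- 12  # documented default of sclust

ratios <- c(cond_number(S1), cond_number(S2))
within_limit <- ratios <= restr.fact  # TRUE where the constraint holds
```

Under this reading, a larger restr.fact allows more elongated clusters, while a value close to 1 pushes the scatter matrices towards spherical shapes.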
This function computes the sclust estimator of Farcomeni (2014). It leads to robust mixture modeling in the presence of entry-wise outliers. It is based on a classification-expectation-snip-maximize (CESM) algorithm. At the S step, the likelihood is optimized over the set of snipped entries; at the M step, the location and scatter estimates are updated. The S step is based on a greedy algorithm, unlike the one proposed in Farcomeni (2014, 2014a). The number of snipped entries, sum(1-V), is kept fixed throughout. Note that initializing with labels arising from classical (non-robust) clustering methods may be detrimental to the final performance of sclust and may even yield an error due to empty clusters.
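Because the number of snipped entries is fixed by the initial V, and an initial labelling with an empty cluster can cause an error, it may help to inspect both before calling sclust. The following is a minimal sketch in base R; check_init is a hypothetical helper, not a snipEM function, and the toy V and R are made up for illustration.

```r
# Hypothetical pre-flight check (not part of snipEM).
# V: binary matrix (0 = snipped entry), R: initial labels in 1..k.
check_init <- function(V, R, k) {
  stopifnot(all(V %in% c(0, 1)), length(R) == nrow(V))
  sizes <- tabulate(R, nbins = k)   # observations per cluster
  list(
    n_snipped = sum(1 - V),         # stays fixed during the algorithm
    sizes     = sizes,
    empty     = which(sizes == 0)   # labels with no observations
  )
}

# Toy input: 6 observations, 3 variables, 2 snipped cells,
# and an initial labelling that leaves cluster 2 empty.
V <- matrix(1, 6, 3)
V[c(2, 11)] <- 0
R <- c(1, 1, 1, 3, 3, 3)

chk <- check_init(V, R, k = 3)
```

If chk$empty is non-empty, a different initial labelling is probably needed before passing R to sclust.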
A list with the following elements:
R | Final cluster labels. |
mu | Estimated location matrix. |
S | Array of estimated scatter matrices. |
V | Final (optimal) V matrix. |
lik | Gaussian log-likelihood at convergence. |
iter | Number of outer iterations before convergence. |
Alessio Farcomeni alessio.farcomeni@uniroma1.it, Andy Leung andy.leung@stat.ubc.ca
Farcomeni, A. (2014) Snipping for robust k-means clustering under component-wise contamination, Statistics and Computing, 24, 909-917
Farcomeni, A. (2014) Robust constrained clustering in presence of entry-wise outliers, Technometrics, 56, 102-111
snipEM, stEM, sumlog, ldmvnorm
set.seed(1234)
X <- matrix(NA,200,5)
# two clusters
k <- 2
X[1:100,] <- rnorm(100*5)
X[101:200,] <- rnorm(100*5,15)
R <- rep(c(1,2), each=100)
# 5% cellwise outliers
s <- sample(200*5,200*5*0.05)
X[s] <- runif(200*5*0.05,-100,100)
# True contamination indicator: 0 marks a contaminated cell, 1 a clean one
V <- X
V[s] <- 0
V[-s] <- 1
# Initial V and R
Vinit <- matrix(1, nrow(X), ncol(X))
Vinit[which(X > quantile(X,0.975) | X < quantile(X,0.025))] <- 0
Rinit <- kmeans(X,2)$cluster
# Snipped robust clustering
sc <- sclust(X,2,Vinit,Rinit)
table(R,Rinit)
table(R,sc$R)
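The example compares cluster labels, but the snipping matrix can be checked in a similar way against the known contaminated cells. The sketch below uses base R only and rebuilds the simulated data so that it runs on its own. It evaluates the quantile-based initial guess Vinit; the same helper could presumably be applied to the V element returned by sclust. snip_agreement is a hypothetical helper, not part of snipEM.

```r
# Rebuild the simulated data from the example
set.seed(1234)
X <- matrix(NA, 200, 5)
X[1:100, ] <- rnorm(100 * 5)
X[101:200, ] <- rnorm(100 * 5, 15)
s <- sample(200 * 5, 200 * 5 * 0.05)
X[s] <- runif(200 * 5 * 0.05, -100, 100)

# True indicator: 0 = contaminated cell, 1 = clean cell
V <- matrix(1, nrow(X), ncol(X))
V[s] <- 0

# Initial guess: snip the most extreme 2.5% of cells in each tail
Vinit <- matrix(1, nrow(X), ncol(X))
Vinit[X > quantile(X, 0.975) | X < quantile(X, 0.025)] <- 0

# Hypothetical helper: cross-tabulate true vs. estimated snipping and
# report the share of contaminated cells that were actually snipped.
snip_agreement <- function(V_true, V_est) {
  tab <- table(truth    = factor(V_true, levels = 0:1),
               estimate = factor(V_est,  levels = 0:1))
  list(table = tab, recall = tab["0", "0"] / sum(tab["0", ]))
}

agr <- snip_agreement(V, Vinit)
agr$table
agr$recall
```

Since sclust keeps sum(1-V) fixed, the number of cells snipped in Vinit determines how many cells the final V can flag.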