RMBC: Robust Model-Based Clustering, a robust and efficient version...


View source: R/RMBC.R

Description

Robust Model-Based Clustering: a robust and efficient version of the EM algorithm.

Usage

RMBC(Y, K, max_iter = 80, tolerance = 1e-04)

Arguments

Y

A data matrix of size n x p (n observations in rows, p variables in columns).

K

The number of clusters.

max_iter

The maximum number of iterations allowed by the algorithm's stopping rule (default 80).

tolerance

The tolerance used by the algorithm's stopping rule (default 1e-04).
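
The sketch below shows how an iteration cap and a tolerance commonly combine in a stopping rule. It is not taken from the RMBC source: the function `iterate_until_converged`, the toy update step, and the relative-change criterion are all illustrative assumptions, and the criterion RMBC actually applies may differ.

```r
# Hypothetical illustration of a max_iter / tolerance stopping rule.
# Not the RMBC implementation; the convergence criterion is an assumption.
iterate_until_converged <- function(update, start,
                                    max_iter = 80, tolerance = 1e-04) {
  current <- start
  converged <- FALSE
  iter <- 0
  while (iter < max_iter && !converged) {
    iter <- iter + 1
    proposal <- update(current)
    # Relative change between successive iterates (assumed criterion)
    change <- abs(proposal - current) / max(abs(current), .Machine$double.eps)
    converged <- change < tolerance
    current <- proposal
  }
  list(value = current, iterations = iter, converged = converged)
}

# Toy fixed-point update: Newton's iteration for sqrt(2)
res <- iterate_until_converged(function(x) 0.5 * (x + 2 / x), start = 1)
res$converged
res$iterations

# An update that never settles stops at the iteration cap instead
capped <- iterate_until_converged(function(x) x + 1, start = 0, max_iter = 5)
capped$converged
capped$iterations
```

Under this reading, a smaller `tolerance` demands a tighter fit before stopping, while `max_iter` bounds the run time when convergence is slow.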

Value

A list containing the estimated parameters of the mixture distribution and the cluster label assigned to each observation.

Examples

# Generate synthetic data (three normal clusters in two dimensions).
# The clusters have different shapes and orientations.
# The data are uniformly contaminated (contamination level 20%).
################################################
#### Start data generating process ############
##############################################

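# Generate the base clusters: 150 standard normal points in two dimensions

Z1 <- c(rnorm(50, 0), rnorm(50, 0), rnorm(50, 0))
Z2 <- rnorm(150)
X <- matrix(0, ncol = 2, nrow = 150)
X[, 1] <- Z1
X[, 2] <- Z2
true.cluster <- c(rep(1, 50), rep(2, 50), rep(3, 50))

# Rotate, expand and translate the base clusters
theta <- pi / 3
# Rotation matrix for angle theta
aux1 <- matrix(c(cos(theta), -sin(theta), sin(theta), cos(theta)), nrow = 2)
# Axis scaling: stretch the first coordinate, shrink the second
aux2 <- sqrt(4) * diag(c(1, 1 / 4))

# B is computed here but not used in the rest of the example
B <- aux1 %*% aux2 %*% t(aux1)

# Cluster 3: scale, rotate, then shift to (15, 2)
X[true.cluster == 3, ] <- X[true.cluster == 3, ] %*% aux2 %*% aux1 +
  matrix(c(15, 2), byrow = TRUE, nrow = 50, ncol = 2)
# Cluster 2: stretch along the second coordinate
X[true.cluster == 2, 2] <- X[true.cluster == 2, 2] * 4
# Cluster 1: flatten along the second coordinate, then shift to (-15, -1)
X[true.cluster == 1, 2] <- X[true.cluster == 1, 2] * 0.1
X[true.cluster == 1, ] <- X[true.cluster == 1, ] +
  matrix(c(-15, -1), byrow = TRUE, nrow = 50, ncol = 2)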

### Generate 30 synthetic outliers (contamination level 20%)

outliers=sample(1:150,30)
X[outliers, ] <- matrix(runif( 60, 2 * min(X), 2 * max(X) ),
                        ncol = 2, nrow = 30)

###############################################
#### END data generating process ############
#############################################

### APPLYING THE RMBC ALGORITHM

ret = RMBC(Y=X, K=3,max_iter = 82)

cluster = ret$cluster
#############################################
### plotting results ########################
#############################################
oldpar=par(mfrow=c(1,2))
plot(X,  main="actual clusters" )
for (j in 1:3){
  points(X[true.cluster==j,],pch=19, col=j+1)
}
points(X[outliers,],pch=19,col=1)

plot(X,main="clusters estimation")
for (j in 1:3){
  points(X[cluster==j,],pch=19, col=j+1)
}
points(X[ret$outliers,],pch=19,col=1)
par(oldpar)
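
Beyond the plots, one way to check the fit numerically is to cross-tabulate the true and estimated labels. The helper `agreement_table` below is hypothetical (it is not part of RMBC), and the toy vectors stand in for `true.cluster`, `ret$cluster` and `ret$outliers`. It assumes that flagged outliers are supplied either as row indices or as a logical vector.

```r
# Hypothetical helper, not part of the RMBC package.
agreement_table <- function(truth, estimate, flagged = integer(0)) {
  # Accept flagged observations as a logical vector or as row indices
  idx <- if (is.logical(flagged)) which(flagged) else flagged
  # Drop flagged observations before comparing labels
  keep <- setdiff(seq_along(truth), idx)
  table(true = truth[keep], estimated = estimate[keep])
}

# Toy stand-ins for the objects created in the example above
truth    <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
estimate <- c(2, 2, 2, 1, 1, 3, 3, 3, 3)
flagged  <- c(6)

tab <- agreement_table(truth, estimate, flagged)
print(tab)
```

With the objects from the example, the analogous call would be `agreement_table(true.cluster, ret$cluster, ret$outliers)`. Estimated cluster numbers are arbitrary, so a good fit shows up as one dominant cell per row rather than as a loaded diagonal.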

RMBC documentation built on July 22, 2021, 9:07 a.m.
