21_init.EM: Initialization and EM Algorithm

Description Usage Arguments Details Value Author(s) References See Also Examples

Description

These functions perform initializations (including em.EM and RndEM) followed by the EM iterations for model-based clustering of finite mixture multivariate Gaussian distribution with unstructured dispersion in both of unsupervised and semi-supervised clusterings.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
init.EM(x, nclass = 1, lab = NULL, EMC = .EMC,
        stable.solution = TRUE, min.n = NULL, min.n.iter = 10,
        method = c("em.EM", "Rnd.EM"))
em.EM(x, nclass = 1, lab = NULL, EMC = .EMC,
      stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
rand.EM(x, nclass = 1, lab = NULL, EMC = .EMC.Rnd,
        stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
exhaust.EM(x, nclass = 1, lab = NULL,
           EMC = .EMControl(short.iter = 1, short.eps = Inf),
           method = c("em.EM", "Rnd.EM"),
           stable.solution = TRUE, min.n = NULL, min.n.iter = 10);

Arguments

x

the data matrix, dimension n * p.

nclass

the desired number of clusters, K.

lab

labeled data for semi-supervised clustering, length n.

EMC

the control for the EM iterations.

stable.solution

if returning a stable solution.

min.n

restriction for a stable solution, the minimum number of observations for every final clusters.

min.n.iter

restriction for a stable solution, the minimum number of iterations for trying a stable solution.

method

an initialization method.

Details

The init.EM calls either em.EM if method="em.EM" or rand.EM if method="Rnd.EM".

The em.EM has two steps: short-EM has loose convergent tolerance controlled by .EMC$short.eps and try several random initializations controlled by .EMC$short.iter, while long-EM starts from the best short-EM result (in terms of log likelihood) and run to convergence with a tight tolerance controlled by .EMC$EM.eps.

The rand.EM also has two steps: first randomly pick several random initializations controlled by .EMC$short.iter, and second starts from the best of the random result (in terms of log likelihood) and run to convergence.

The lab is only for the semi-supervised clustering, and it contains pre-labeled indices between 1 and K for labeled observations. Observations with index 0 is non-labeled and has to be clustered by the EM algorithm. Indices will be assigned by the results of the EM algorithm. See demo(allinit_ss,'EMCluster') for details.

The exhaust.EM also calls the init.EM with different EMC and perform exhaust.iter times of EM algorithm with different initials. The best result is returned.

Value

These functions return an object emobj with class emret which can be used in post-process or other functions such as e.step, m.step, assign.class, em.ic, and dmixmvn.

Author(s)

Wei-Chen Chen [email protected] and Ranjan Maitra.

References

http://maitra.public.iastate.edu/

See Also

emcluster, .EMControl.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
## Not run: 
library(EMCluster, quietly = TRUE)
set.seed(1234)
x <- da1$da

ret.em <- init.EM(x, nclass = 10, method = "em.EM")
ret.Rnd <- init.EM(x, nclass = 10, method = "Rnd.EM", EMC = .EMC.Rnd)

emobj <- simple.init(x, nclass = 10)
ret.init <- emcluster(x, emobj, assign.class = TRUE)

par(mfrow = c(2, 2))
plotem(ret.em, x)
plotem(ret.Rnd, x)
plotem(ret.init, x)

## End(Not run)

Example output

Loading required package: MASS
Loading required package: Matrix
Warning: stack imbalance in '.Call', 16 then 8
Warning: stack imbalance in '<-', 14 then 6
Warning: stack imbalance in '<-', 2 then -6
Warning: stack imbalance in '{', 13 then 5
Warning: stack imbalance in '<-', 9 then 1
Warning: stack imbalance in '{', 6 then -2
Warning: stack imbalance in '<-', 2 then -6

EMCluster documentation built on Aug. 26, 2017, 9:04 a.m.