Initialization and EM | R Documentation |
These functions perform initialization (em.EM or Rnd.EM) followed by EM iterations for model-based clustering with a finite mixture of multivariate Gaussian distributions with unstructured dispersion, in both unsupervised and semi-supervised settings.
init.EM(x, nclass = 1, lab = NULL, EMC = .EMC,
stable.solution = TRUE, min.n = NULL, min.n.iter = 10,
method = c("em.EM", "Rnd.EM"))
em.EM(x, nclass = 1, lab = NULL, EMC = .EMC,
stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
rand.EM(x, nclass = 1, lab = NULL, EMC = .EMC.Rnd,
stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
exhaust.EM(x, nclass = 1, lab = NULL,
EMC = .EMControl(short.iter = 1, short.eps = Inf),
method = c("em.EM", "Rnd.EM"),
stable.solution = TRUE, min.n = NULL, min.n.iter = 10)
x |
the data matrix, dimension n × p (one observation per row). |
nclass |
the desired number of clusters, K. |
lab |
labeled data for semi-supervised clustering, length n (one entry per observation). |
EMC |
the control for the EM iterations. |
stable.solution |
whether to return a stable solution. |
min.n |
restriction for a stable solution: the minimum number of observations in every final cluster. |
min.n.iter |
restriction for a stable solution: the minimum number of iterations to try for a stable solution. |
method |
an initialization method, either "em.EM" or "Rnd.EM". |
The init.EM function calls either em.EM if method="em.EM" or rand.EM if method="Rnd.EM".
The em.EM function has two steps. The short-EM step tries several random initializations, controlled by .EMC$short.iter, each run with a loose convergence tolerance controlled by .EMC$short.eps. The long-EM step starts from the best short-EM result (in terms of log likelihood) and runs to convergence with a tight tolerance controlled by .EMC$em.eps.
The rand.EM function also has two steps: it first draws several random initializations, controlled by .EMC$short.iter, and then starts from the best of them (in terms of log likelihood) and runs to convergence.
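The two strategies described above can be illustrated with a small base-R sketch. This is not EMCluster's implementation: it fits a univariate two-component mixture, all function names (comp_dens, mix_loglik, random_start, em_run) are hypothetical, the relative log-likelihood stopping rule is an assumption, and the number of random starts (10) merely stands in for the role the text assigns to .EMC$short.iter.

```r
# Conceptual sketch only (univariate mixture, base R); names are hypothetical.
comp_dens <- function(x, p) {
  # n x K matrix of weighted component densities
  sapply(seq_along(p$mu), function(k) p$prop[k] * dnorm(x, p$mu[k], p$sigma[k]))
}
mix_loglik <- function(x, p) sum(log(rowSums(comp_dens(x, p))))

random_start <- function(x, k) {
  # Random data points as means, a common spread, equal proportions
  list(mu = sample(x, k), sigma = rep(sd(x), k), prop = rep(1 / k, k))
}

em_run <- function(x, p, eps, max_iter = 500) {
  ll_old <- mix_loglik(x, p)
  for (iter in seq_len(max_iter)) {
    dens <- comp_dens(x, p)
    post <- dens / rowSums(dens)                 # E-step: posterior probabilities
    nk <- colSums(post)                          # M-step: update the parameters
    mu <- colSums(post * x) / nk
    sigma <- sqrt(colSums(post * outer(x, mu, "-")^2) / nk)
    p <- list(mu = mu, sigma = sigma, prop = nk / length(x))
    ll <- mix_loglik(x, p)
    if (abs(ll - ll_old) <= eps * abs(ll_old)) break  # assumed relative tolerance
    ll_old <- ll
  }
  c(p, list(loglik = ll))
}

set.seed(1)
x <- c(rnorm(100, 0, 1), rnorm(100, 5, 1))
starts <- replicate(10, random_start(x, 2), simplify = FALSE)

# em-EM style: short runs with a loose tolerance, then a long run from the best
short <- lapply(starts, function(p) em_run(x, p, eps = 1e-2))
best_short <- short[[which.max(sapply(short, function(s) s$loglik))]]
fit_emEM <- em_run(x, best_short, eps = 1e-6)

# Rnd-EM style: rank the raw starts by log likelihood, then a long run from the best
ll_start <- sapply(starts, function(p) mix_loglik(x, p))
fit_RndEM <- em_run(x, starts[[which.max(ll_start)]], eps = 1e-6)
```

The only difference between the two drivers is how the candidate starts are ranked: after a few loosely converged EM iterations, or with no iterations at all.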
The lab argument is used only for semi-supervised clustering. It contains pre-labeled indices between 1 and K for labeled observations. Observations with index 0 are unlabeled and have to be clustered by the EM algorithm; their indices are assigned from the results of the EM algorithm. See demo(allinit_ss,'EMCluster') for details.
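As a minimal sketch of the lab encoding described above, the following builds a label vector with 0 for unlabeled observations. The helper make_lab and the synthetic two-cluster data are hypothetical and serve only as illustration. The call to init.EM is guarded so that the snippet still runs when EMCluster is not installed.

```r
# Hypothetical helper: 'known' holds cluster ids in 1..K, NA where unknown.
make_lab <- function(known) {
  lab <- known
  lab[is.na(lab)] <- 0   # 0 marks observations left to the EM algorithm
  lab
}

set.seed(1234)
# Synthetic data: two well-separated bivariate clusters of 50 points each
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

known <- rep(NA_real_, nrow(x))
known[1:5] <- 1      # five observations known to belong to cluster 1
known[51:55] <- 2    # five observations known to belong to cluster 2
lab <- make_lab(known)
table(lab)

if (requireNamespace("EMCluster", quietly = TRUE)) {
  library(EMCluster, quietly = TRUE)
  ret.ss <- init.EM(x, nclass = 2, lab = lab, method = "em.EM")
}
```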
The exhaust.EM function also calls init.EM, with a different EMC, and runs the EM algorithm exhaust.iter times with different initial values. The best result is returned.
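A call might look like the sketch below. It assumes that exhaust.iter is stored as an element of the control list (it is set by list assignment here, not through a documented argument), and the value 20 is purely illustrative. The call is guarded so that nothing runs if EMCluster is not installed.

```r
n_runs <- 20  # illustrative number of restarts

if (requireNamespace("EMCluster", quietly = TRUE)) {
  library(EMCluster, quietly = TRUE)
  set.seed(1234)
  x <- da1$da
  # Same control settings as the default shown in the usage of exhaust.EM
  EMC.ex <- .EMControl(short.iter = 1, short.eps = Inf)
  EMC.ex$exhaust.iter <- n_runs   # assumption: the element read by exhaust.EM
  ret.ex <- exhaust.EM(x, nclass = 10, EMC = EMC.ex, method = "Rnd.EM")
}
```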
These functions return an object emobj of class emret, which can be used in post-processing or passed to other functions such as e.step, m.step, assign.class, em.ic, and dmixmvn.
Wei-Chen Chen wccsnow@gmail.com and Ranjan Maitra.
https://www.stat.iastate.edu/people/ranjan-maitra
emcluster, .EMControl.
## Not run:
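library(EMCluster, quietly = TRUE)
set.seed(1234)
x <- da1$da
# Initialize with em-EM and with Rnd-EM, each followed by the EM iterations.
ret.em <- init.EM(x, nclass = 10, method = "em.EM")
ret.Rnd <- init.EM(x, nclass = 10, method = "Rnd.EM", EMC = .EMC.Rnd)
# Alternatively, build a simple initialization and run EM from it.
emobj <- simple.init(x, nclass = 10)
ret.init <- emcluster(x, emobj, assign.class = TRUE)
# Plot the three solutions in a 2 x 2 layout.
par(mfrow = c(2, 2))
plotem(ret.em, x)
plotem(ret.Rnd, x)
plotem(ret.init, x)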
## End(Not run)