In mixtools: Tools for Analyzing Finite Mixture Models

normalmixEM2comp

R Documentation

Fast EM Algorithm for 2-Component Mixtures of Univariate Normals

Description

Return EM algorithm output for mixtures of univariate normal distributions for the special case of 2 components, exploiting the simple structure of the problem to speed up the code.

Usage

normalmixEM2comp(x, lambda, mu, sigsqrd, eps= 1e-8, maxit = 1000, verb=FALSE)

Arguments

`x`	A vector of length `n` consisting of the data.
`lambda`	Initial value of first-component mixing proportion.
`mu`	A 2-vector of initial values for the mean parameters.
`sigsqrd`	Either a scalar or a 2-vector with initial value(s) for the variance parameters. If a scalar, the algorithm assumes that the two components have equal variances; if a 2-vector, it assumes that the two components do not have equal variances.
`eps`	The convergence criterion. Convergence is declared when the observed-data log-likelihood increases by less than `eps` from one iteration to the next.
`maxit`	The maximum possible number of iterations.
`verb`	If TRUE, then various updates are printed during each iteration of the algorithm.
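
As a rough illustration of how these arguments enter the iteration, the following base-R sketch runs a textbook EM loop for two normal components. The names `loglik2` and `em2_sketch` are hypothetical, and this is not the mixtools source, which may be organized quite differently for speed; the sketch only mirrors the documented roles of `lambda`, `mu`, `sigsqrd`, `eps`, and `maxit`.

```r
# Hypothetical helper (not part of mixtools): observed-data log-likelihood
# of a 2-component normal mixture with variances s2[1] and s2[2].
loglik2 <- function(x, lambda, mu, s2) {
  sum(log(lambda * dnorm(x, mu[1], sqrt(s2[1])) +
          (1 - lambda) * dnorm(x, mu[2], sqrt(s2[2]))))
}

# Hypothetical sketch of the EM loop; assumes the same argument meanings
# as documented above, but is NOT the package's implementation.
em2_sketch <- function(x, lambda, mu, sigsqrd, eps = 1e-8, maxit = 1000) {
  arbvar <- length(sigsqrd) == 2        # 2-vector => unequal variances
  s2 <- rep(sigsqrd, length.out = 2)
  ll <- loglik2(x, lambda, mu, s2)      # log-likelihood at initial values
  iter <- 0
  repeat {
    # E-step: posterior probability that each point is from component 1
    d1 <- lambda * dnorm(x, mu[1], sqrt(s2[1]))
    d2 <- (1 - lambda) * dnorm(x, mu[2], sqrt(s2[2]))
    p <- d1 / (d1 + d2)
    # M-step: weighted updates of proportion, means, and variance(s)
    lambda <- mean(p)
    mu <- c(sum(p * x) / sum(p), sum((1 - p) * x) / sum(1 - p))
    ss1 <- sum(p * (x - mu[1])^2)
    ss2 <- sum((1 - p) * (x - mu[2])^2)
    if (arbvar) {
      s2 <- c(ss1 / sum(p), ss2 / sum(1 - p))
    } else {
      s2 <- rep((ss1 + ss2) / length(x), 2)
    }
    iter <- iter + 1
    ll <- c(ll, loglik2(x, lambda, mu, s2))
    # Stop when the increase falls below eps or maxit is reached
    if (ll[iter + 1] - ll[iter] < eps || iter >= maxit) break
  }
  list(lambda = c(lambda, 1 - lambda), mu = mu,
       sigma = sqrt(if (arbvar) s2 else s2[1]), all.loglik = ll)
}

fit <- em2_sketch(faithful$waiting, lambda = 0.5, mu = c(50, 80), sigsqrd = 100)
fit$mu
```

Passing a scalar `sigsqrd` keeps a single pooled variance throughout; passing a 2-vector switches the sketch to separate variance updates.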

Details

This code is written to be very fast, sometimes more than an order of magnitude faster than normalmixEM for the same problem. It is less numerically stable than normalmixEM in the sense that it does not safeguard against underflow as carefully.
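
The kind of underflow at issue can be illustrated in base R. The snippet below is only a sketch with made-up values, not code from either function: for an observation far from both component means, the directly computed densities underflow to zero and the log-likelihood contribution becomes -Inf, whereas evaluating on the log scale (the log-sum-exp trick, one common safeguard) stays finite.

```r
# Illustrative values only: one observation far from both component means
x0 <- 200
lambda <- 0.5
mu <- c(0, 1)
sigma <- c(1, 1)

# Direct evaluation: both weighted densities underflow to exactly 0
direct <- log(lambda * dnorm(x0, mu[1], sigma[1]) +
              (1 - lambda) * dnorm(x0, mu[2], sigma[2]))

# Log-scale evaluation: add log weights to log densities, then combine
# with log-sum-exp so that no intermediate quantity underflows
lw <- c(log(lambda) + dnorm(x0, mu[1], sigma[1], log = TRUE),
        log(1 - lambda) + dnorm(x0, mu[2], sigma[2], log = TRUE))
stable <- max(lw) + log(sum(exp(lw - max(lw))))

direct   # -Inf
stable   # finite
```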

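Note that when the two components are assumed to have unequal variances, the log-likelihood is unbounded: it grows without limit as one component's mean is placed on a data point and that component's variance shrinks toward zero. In practice, however, this is rarely a problem, and quite often the algorithm converges to a "nice" local maximum.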
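
The unboundedness can be checked numerically. The following base-R sketch uses a hypothetical helper, `mix2_loglik`, which is not a mixtools function. It centers the first component on one observation and evaluates the log-likelihood for a shrinking standard deviation.

```r
x <- faithful$waiting

# Hypothetical helper (not part of mixtools): observed-data log-likelihood
# of a 2-component normal mixture with standard deviations sigma[1:2].
mix2_loglik <- function(x, lambda, mu, sigma) {
  sum(log(lambda * dnorm(x, mu[1], sigma[1]) +
          (1 - lambda) * dnorm(x, mu[2], sigma[2])))
}

# Center component 1 exactly on the first observation and shrink its
# standard deviation; component 2 stays broad and covers all the data.
sds <- 10^(-(1:6))
ll <- sapply(sds, function(s) {
  mix2_loglik(x, lambda = 0.5, mu = c(x[1], mean(x)), sigma = c(s, sd(x)))
})

cbind(sd1 = sds, loglik = ll)   # loglik keeps increasing as sd1 shrinks
```

Each tenfold reduction in the first standard deviation raises the log-likelihood further, so no finite global maximum exists in the unequal-variance case.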

Value

normalmixEM2comp returns a list of class mixEM with items:

`x`	The raw data.
`lambda`	The final mixing proportions (lambda and 1-lambda).
`mu`	The final two mean parameters.
`sigma`	The final one or two standard deviations.
`loglik`	The final log-likelihood.
`posterior`	An n x 2 matrix of posterior probabilities for observations.
`all.loglik`	A vector of each iteration's log-likelihood. This vector includes both the initial and the final values; thus, the number of iterations is one less than its length.
`restarts`	The number of times the algorithm restarted due to unacceptable choice of initial values (always zero).
`ft`	A character vector giving the name of the function.
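
A short sketch of how the returned items might be inspected, assuming mixtools is installed. The variable names `n_iter` and `cluster` are illustrative only.

```r
library(mixtools)

out <- normalmixEM2comp(faithful$waiting, lambda = 0.5,
                        mu = c(50, 80), sigsqrd = 100)

out$lambda   # mixing proportions: lambda and 1 - lambda
out$mu       # the two fitted means
out$sigma    # fitted standard deviation(s)

# all.loglik includes the initial value, so the number of iterations
# is one less than its length
n_iter <- length(out$all.loglik) - 1

# Assign each observation to its most probable component
cluster <- apply(out$posterior, 1, which.max)
table(cluster)
```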

References

McLachlan, G. J. and Peel, D. (2000) Finite Mixture Models, John Wiley and Sons, Inc.

Examples

##Analyzing the Old Faithful geyser data with a 2-component mixture of normals.

data(faithful)
attach(faithful)
set.seed(100)
system.time(out <- normalmixEM2comp(waiting, lambda=.5, 
            mu=c(50,80), sigsqrd=100))
out$all.loglik # Note:  must be monotone increasing

# Compare elapsed time with more general version
system.time(out2 <- normalmixEM(waiting, lambda=c(.5,.5), 
            mu=c(50,80), sigma=c(10,10), arbvar=FALSE))
out2$all.loglik # Values should be identical to above

mixtools documentation built on April 12, 2025, 2:31 a.m.