fitmix: Fit finite mixture of univariate Gaussian densities to data.

Description Usage Arguments Details Value Author(s) Examples

View source: R/fitmix.R

Description

Implementation of EM algorithm to fit k component univariate Gaussian mixture to data, for user specified value of k. In constrast to general purpose (unconstrained) Gaussian mixture models, this function allows certain restrictions or parameter space reductions that make the model correspond more closely to a classical quantitative genetics model.

Usage

1
2
fitmix(x, k, tol = 1e-6, maxit = 100, restarts = 20,
       p.binomial = FALSE, mu.additive = FALSE, sigma.common = FALSE)

Arguments

x

Real vector of data to be fitted by the model.

k

Number of components in the mixture.

tol

Threshold for log-likelihood increase for convergence.

maxit

Maximum number of iterations for each run of the EM algorithm.

restarts

Number of times to restart EM algorithm at random initial points.

p.binomial

Assume mixture proportions correspond to binomial expansion (corresponds to Hardy-Weinberg for k=3).

mu.additive

Assume means follow additive model.

sigma.common

Assume variances are same for all components.

Details

Most likely applications are k=2 with sigma.common=TRUE for a completely dominant or recessive genetic model, and k=3 with p.binomial=TRUE, mu.additive=TRUE and sigma.common=TRUE for a codominant biallelic genetic model.

Note that the unconstrained model has many global optima that allow unbounded increase in the log likelihood as one (or more) mixture components have sigma tend to zero as mu tends to one of the data observed values. The EM algorithm rarely converges to these optima since they typically have very small domains of attraction.

Value

A list. The elements p, mu, and sigma, are each vectors of length k, and describe the mixture found that maximised the likelihood. An element loglik is a vector of length restarts+1 and gives the log likelihood at the end of each run of the EM algorithm. Inspecting loglik may give some indication of whether a “good” local optimum has been found.

Author(s)

Toby Johnson Toby.x.Johnson@gsk.com

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
xx <- fitmix.simulate(100, c(0.49, 0.42, 0.09), c(0, 1, 2), c(.3, .3, .3))

## additive model, common variance, Hardy--Weinberg
fit.a <- fitmix(xx, 3, maxit = 10, restarts = 3,
                sigma.common = TRUE, p.binomial = TRUE, mu.additive = TRUE)
## general (unrestricted) fit
fit.g <- fitmix(xx, 3, maxit = 10, restarts = 3)

oldpar <- par(mfrow = c(1, 2))
fitmix.plot(xx, fit.a$p, fit.a$mu, fit.a$sigma)
fitmix.plot(xx, fit.g$p, fit.g$mu, fit.g$sigma)
par(oldpar)

tobyjohnson/gtx documentation built on Aug. 30, 2019, 8:07 p.m.