EMmixlcd: Estimate the mixture proportions and component densities...
In LogConcDEAD: Log-Concave Density Estimation in Arbitrary Dimensions

EMmixlcd

R Documentation

Estimate the mixture proportions and component densities using the EM algorithm

Description

Uses the EM algorithm to estimate the mixture proportions and the component densities. The output is an object of class "lcdmix", which contains the mixture proportions at each observation and all the information about the estimated component densities.

Usage

  EMmixlcd( x, k = 2, y, props, epsratio=10^-6, max.iter=50,
            epstheta=10^-8, verbose=-1 )

Arguments

`x`	Data in `R^d`, in the form of an `n \times d` numeric `matrix`
`k`	The number of components; defaults to 2
`y`	An `n \times k` numeric `matrix` giving the starting values for the EM algorithm. If none is given, a hierarchical Gaussian clustering model is used. To reduce the computational burden while allowing sufficient flexibility for the EM algorithm, it is recommended to leave this argument unspecified.
`props`	Vector of length `k` containing the starting values of the proportions. If none is given, a hierarchical Gaussian clustering model is used. To reduce the computational burden while allowing sufficient flexibility for the EM algorithm, it is recommended to leave this argument unspecified.
`epsratio`	The EM algorithm terminates if the proportional increase in the likelihood is less than this specified ratio. Default value is `10^{-6}`.
`max.iter`	The maximum number of iterations for the EM algorithm
`epstheta`	`epstheta/n` is the threshold of the weight below which a data point is discarded from the cluster. This quantity is introduced to increase computational efficiency and stability.
`verbose`	`-1`: (default) prints nothing; `0`: prints warning messages; `n > 0`: prints summary information every `n` iterations
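
The role of `epstheta` can be pictured with a small base-R sketch. It assumes, hypothetically, that each cluster carries a vector of per-observation weights summing to 1, and that points whose weight falls below `epstheta/n` are dropped before the component is refitted. The names `w`, `keep` and `w_kept`, and the renormalization step, are illustrative and are not taken from the package source.

```r
# Illustrative only: weights of n = 5 observations within one cluster.
n <- 5
epstheta <- 10^-8
w <- c(0.5, 0.3, 0.2 - 2e-10, 1e-10, 1e-10)

# Keep observations whose weight is at least epstheta / n (here 2e-9).
keep <- w >= epstheta / n

# Assumed step: renormalize the surviving weights so that they sum to 1.
w_kept <- w[keep] / sum(w[keep])
```

Here the last two observations carry negligible weight and are excluded, so the component fit would only see three points.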

Details

An introduction to the EM algorithm can be found in McLachlan and Krishnan (1997). Briefly, given the current estimates of the mixture proportions and component densities, we first update the estimates of the mixture proportions. We then update the estimates of the component densities by using `mlelcd`. In fact, incorporating the weights into the maximization process in `mlelcd` presents no additional complication.
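
One pass of the update described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: `logf` and `props` are made-up current estimates, and `theta`, `new_props` and `weights` are hypothetical names. The refit of each component is only indicated in a comment, using the `w` (weights) argument of `mlelcd`.

```r
# Hypothetical current estimates for n = 4 points and k = 2 components.
# logf[i, j] is the log density of component j at observation i.
logf <- matrix(log(c(0.30, 0.20, 0.05, 0.01,
                     0.02, 0.10, 0.25, 0.40)), ncol = 2)
props <- c(0.5, 0.5)

# Posterior membership probabilities: props[j] * f_j(x_i), normalized per row.
joint <- sweep(exp(logf), 2, props, `*`)
theta <- joint / rowSums(joint)

# Update the mixture proportions as the average membership probability.
new_props <- colMeans(theta)

# Per-component observation weights (each column sums to 1). With the
# package loaded, column j could be supplied as the weights of a refit of
# component j, e.g. mlelcd(x, w = weights[, j]).
weights <- sweep(theta, 2, colSums(theta), `/`)
```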

In our case, because of the computational intensity of the method, we first cluster the points according to a hierarchical Gaussian clustering model and then iterate the EM algorithm until the proportional increase in the likelihood at a step is less than a pre-specified quantity.
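
The stopping rule can be sketched as follows. The exact quantity compared with `epsratio` is not spelled out here, so this sketch assumes it is the relative change in the log-likelihood between successive iterations. The helper `has_converged` and the trace `loglik_trace` are hypothetical and stand in for the values a real run would produce.

```r
# Hypothetical helper: TRUE when the relative change in log-likelihood
# between two successive iterations falls below epsratio.
has_converged <- function(old_loglik, new_loglik, epsratio = 10^-6) {
  abs((new_loglik - old_loglik) / old_loglik) < epsratio
}

# Made-up log-likelihood trace standing in for successive EM iterations.
loglik_trace <- c(-52.0, -48.5, -48.1, -48.09999)
max.iter <- 50

# Iterate until convergence or until max.iter steps have been taken.
niter <- 0
for (i in seq_len(min(max.iter, length(loglik_trace) - 1))) {
  niter <- i
  if (has_converged(loglik_trace[i], loglik_trace[i + 1])) break
}
```

With this trace the loop stops at the third step, where the relative change first drops below `10^-6`.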

More technical details can be found in Cule, Samworth and Stewart (2010).

Value

An object of class "lcdmix", with the following components:

`x`	Data copied from input (may be reordered)
`logf`	An `n \times k` `matrix` of the log of the maximum likelihood estimate, evaluated at the observation points for each component.
`props`	Vector containing the estimated proportions of components
`niter`	Number of iterations of the EM algorithm
`lcdloglik`	The log-likelihood after the final iteration
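
As a rough illustration of how the `logf` and `props` components fit together, the sketch below builds a hypothetical stand-in list (not a real fitted object) with made-up values and derives the mixture density, posterior membership probabilities, and hard cluster labels at the observations. Only the documented component names are used; `mix_density`, `posterior` and `labels` are illustrative names.

```r
# Hypothetical stand-in for a fitted "lcdmix" object with n = 3, k = 2;
# only the documented component names `logf` and `props` are used.
fit <- list(
  logf  = matrix(log(c(0.20, 0.10, 0.05,
                       0.02, 0.04, 0.30)), ncol = 2),
  props = c(0.6, 0.4)
)

# Mixture density at each observation: sum over j of props[j] * exp(logf[i, j]).
mix_density <- as.vector(exp(fit$logf) %*% fit$props)

# Posterior probability that observation i belongs to component j.
posterior <- sweep(exp(fit$logf), 2, fit$props, `*`) / mix_density

# Hard cluster labels from the largest posterior probability.
labels <- max.col(posterior, ties.method = "first")
```

The same expressions should apply to an object returned by `EMmixlcd`, assuming its `logf` and `props` components have the shapes documented above.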

Author(s)

Yining Chen

Madeleine Cule

Robert B. Gramacy

Richard Samworth

References

Cule, M. L., Samworth, R. J., and Stewart, M. I. (2010) Maximum likelihood estimation of a log-concave density, Journal of the Royal Statistical Society, Series B, 72(5), pp. 545-607.

McLachlan, G. J. and Krishnan, T. (1997) The EM Algorithm and Extensions, New York: Wiley.

Examples

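## Simple bivariate data: a two-component Gaussian mixture
  set.seed( 1 )
  n <- 15
  d <- 2
  props <- c( 0.6, 0.4 )
  shift <- 2
  x <- matrix( rnorm( n*d ), ncol = d )
  ## Shift the first coordinate by `shift` with probability props[ 1 ]
  shiftvec <- ifelse( runif( n ) > props[ 1 ], 0, shift )
  x[,1] <- x[,1] + shiftvec
  ## Fit a 2-component mixture, capped at 2 EM iterations
  EMmixlcd( x, k = 2, max.iter = 2 )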

LogConcDEAD documentation built on April 3, 2025, 11:55 p.m.