View source: R/bmm.R

bmm.fixed.num.components    R Documentation

bmm.fixed.num.components: Fits a Beta mixture with a fixed number of components using a variational Bayesian approach

Description

bmm.fixed.num.components uses a variational Bayesian approach to fit a mixture of Beta distributions to proportion data, without dropping any components/clusters. To instead automatically determine the number of components, use bmm, which invokes this function.

This implements the derivations described in

Bayesian Estimation of Beta Mixture Models with Variational Inference. Ma and Leijon. IEEE Transactions on Pattern Analysis and Machine Intelligence (2011) 33: 2160-2173.

and

Variational Learning for Finite Dirichlet Mixture Models and Applications. Fan, Bouguila, and Ziou. IEEE Transactions on Neural Networks and Learning Systems (2012) 23: 762-774.

Notation and equation references here follow those used in Ma and Leijon.
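As a reading aid, the model being fitted can be written out as below. This is a sketch assembled from the argument descriptions on this page (Beta components, gamma distributions over u and v, a Dirichlet distribution over pi), assuming the dimensions of an item are conditionally independent given its component. The indices n, d, k and the symbol N_c (for N.c) are chosen here for illustration; the cited paper remains the authoritative statement.

```latex
% Mixture density for item x_n (row n of X), with N_c components and D dimensions
p(x_n \mid \pi, U, V) = \sum_{k=1}^{N_c} \pi_k \prod_{d=1}^{D}
    \mathrm{Beta}(x_{nd};\, u_{dk}, v_{dk}),
\qquad
\mathrm{Beta}(x;\, u, v) = \frac{\Gamma(u+v)}{\Gamma(u)\,\Gamma(v)}\, x^{u-1} (1-x)^{v-1}

% Gamma distributions (shape, rate) over the Beta parameters, Dirichlet over the mixing coefficients
u_{dk} \sim \mathrm{Gamma}(\mu_{dk}, \alpha_{dk}), \qquad
v_{dk} \sim \mathrm{Gamma}(\nu_{dk}, \beta_{dk}), \qquad
\pi \sim \mathrm{Dirichlet}(c_1, \dots, c_{N_c})

% Gamma density in the shape/rate parameterization used above
\mathrm{Gamma}(u;\, \mu, \alpha) = \frac{\alpha^{\mu}}{\Gamma(\mu)}\, u^{\mu-1} e^{-\alpha u}
```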

Usage

   bmm.fixed.num.components(X, N.c, r, mu, alpha, nu, beta, c, E.pi, 
                            mu0, alpha0, nu0, beta0, c0, 
                            convergence.threshold = 10^-4,
                            max.iterations = 10000,
                            verbose = 0)

Arguments

X

an N x D matrix with rows being the items to cluster. All entries are assumed to be proportions (i.e., between 0 and 1). Note that there is no summation restriction; i.e., the proportions need not sum to unity across an item's dimensions.

N.c

the number of components/clusters to attempt

r

the N x N.c matrix of initial responsibilities, with r[n, nc] giving the probability that item n belongs to component nc

mu

a D x N.c matrix holding the _initial_ values of the shape parameters for the gamma prior distributions over the u parameters, i.e., mu[d, nc] is the shape parameter governing u[d, nc]. Introduced in eqn (15). NB: this is the initial value of mu, which is updated upon iteration. It is not (necessarily) the same as the hyperparameter mu0, which is unchanged by iteration.

alpha

a D x N.c matrix holding the _initial_ values of the rate (i.e., inverse scale) parameters for the gamma prior distributions over the u parameters, i.e., alpha[d, nc] is the rate parameter governing u[d, nc]. Introduced in eqn (15). NB: this is the initial value of alpha, which is updated upon iteration. It is not (necessarily) the same as the hyperparameter alpha0, which is unchanged by iteration.

nu

a D x N.c matrix holding the _initial_ values of the shape parameters for the gamma prior distributions over the v parameters, i.e., nu[d, nc] is the shape parameter governing v[d, nc]. Introduced in eqn (16). NB: this is the initial value of nu, which is updated upon iteration. It is not (necessarily) the same as the hyperparameter nu0, which is unchanged by iteration.

beta

a D x N.c matrix holding the _initial_ values of the rate (i.e., inverse scale) parameters for the gamma prior distributions over the v parameters, i.e., beta[d, nc] is the rate parameter governing v[d, nc]. Introduced in eqn (16). NB: this is the initial value of beta, which is updated upon iteration. It is not (necessarily) the same as the hyperparameter beta0, which is unchanged by iteration.

c

a vector with N.c components (one per mixing coefficient) holding the _initial_ values of the parameters of the Dirichlet distribution over the mixing coefficients pi. Introduced in eqn (19). NB: this is the initial value of c, which is updated upon iteration. It is not (necessarily) the same as the hyperparameter c0, which is unchanged by iteration.

E.pi

the N.c-vector holding the initial values of E[pi], i.e., the expected values of the mixing coefficients (one per component), defined in eqn (53).

mu0, alpha0, nu0, beta0, c0

the hyperparameters corresponding to the above initial values (and with the same respective matrix/vector dimensionality).

convergence.threshold

threshold on the absolute difference between the expected mixing coefficient values of consecutive iterations; convergence is declared once the difference falls below this value.

max.iterations

maximum number of iterations to attempt

verbose

if 1, output progress in terms of the expected mixing coefficient values.
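The arguments above can be assembled as in the following minimal sketch. It is illustrative only: the toy data, the k-means initialization of r, the hyperparameter values, and the choice to start E.pi at the mean responsibility per component are all assumptions made here, not defaults documented for the package. The vector c is given one entry per component because it parameterizes the Dirichlet distribution over the N.c mixing coefficients.

```r
set.seed(1)

# Toy data: N items in D dimensions, every entry a proportion in (0, 1).
N <- 200; D <- 2
X <- rbind(
  matrix(rbeta(N / 2 * D, shape1 = 2, shape2 = 8), ncol = D),
  matrix(rbeta(N / 2 * D, shape1 = 8, shape2 = 2), ncol = D)
)

N.c <- 3  # number of components/clusters to attempt

# Initial responsibilities from a hard k-means assignment (one-hot rows).
# Assumption: any N x N.c matrix whose rows sum to 1 could be used instead.
km <- kmeans(X, centers = N.c, nstart = 10)
r <- matrix(0, nrow = N, ncol = N.c)
r[cbind(seq_len(N), km$cluster)] <- 1

# Hyperparameters: illustrative values only (assumed, not package defaults).
mu0    <- matrix(1,     nrow = D, ncol = N.c)  # gamma shape for u
alpha0 <- matrix(0.005, nrow = D, ncol = N.c)  # gamma rate for u
nu0    <- matrix(1,     nrow = D, ncol = N.c)  # gamma shape for v
beta0  <- matrix(0.005, nrow = D, ncol = N.c)  # gamma rate for v
c0     <- rep(0.001, N.c)                      # Dirichlet parameters for pi

# Start the iterated parameters at the hyperparameter values.
mu <- mu0; alpha <- alpha0; nu <- nu0; beta <- beta0; c <- c0

# Initial expected mixing coefficients: mean responsibility per component.
E.pi <- colMeans(r)

# With the package attached, the call would follow the Usage signature:
# fit <- bmm.fixed.num.components(X, N.c, r, mu, alpha, nu, beta, c, E.pi,
#                                 mu0, alpha0, nu0, beta0, c0)
```

Starting the iterated parameters at their hyperparameter values is only one option; as the NB remarks above point out, the initial values need not equal the hyperparameters.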

Value

A list with the following entries:

retVal

0 indicates successful convergence; -1 indicates a failure to converge.

mu

a D x N.c matrix holding the _converged final_ values of the shape parameters for the gamma prior distributions over the u parameters, i.e., mu[d, nc] is the shape parameter governing u[d, nc]. Introduced in eqn (15).

alpha

a D x N.c matrix holding the _converged final_ values of the rate (i.e., inverse scale) parameters for the gamma prior distributions over the u parameters, i.e., alpha[d, nc] is the rate parameter governing u[d, nc]. Introduced in eqn (15).

nu

a D x N.c matrix holding the _converged final_ values of the shape parameters for the gamma prior distributions over the v parameters, i.e., nu[d, nc] is the shape parameter governing v[d, nc]. Introduced in eqn (16).

beta

a D x N.c matrix holding the _converged final_ values of the rate (i.e., inverse scale) parameters for the gamma prior distributions over the v parameters, i.e., beta[d, nc] is the rate parameter governing v[d, nc]. Introduced in eqn (16).

c

a vector with N.c components (one per mixing coefficient) holding the _converged final_ values of the parameters of the Dirichlet distribution over the mixing coefficients pi. Introduced in eqn (19).

r

the N x N.c matrix of responsibilities, with r[n, nc] giving the probability that item n belongs to component nc

num.iterations

the number of iterations required to reach convergence.

ln.rho

an N x N.c matrix holding the ln[rho], as defined in eqn (32).

E.lnu

the D x N.c matrix holding the values E_u[ln u], defined following eqn (51).

E.lnv

the D x N.c matrix holding the values E_v[ln v], defined following eqn (51).

E.lnpi

the N.c-vector holding the values E[ln pi] (one per component), defined following eqn (51).

E.pi

the N.c-vector holding the values E[pi], i.e., the expected values of the mixing coefficients (one per component), defined in eqn (53).

E.quadratic.u

the D x N.c matrix holding the values E_u[(ln u - ln u^bar)^2] defined following eqn (51).

E.quadratic.v

the D x N.c matrix holding the values E_v[(ln v - ln v^bar)^2] defined following eqn (51).

ubar

the D x N.c matrix holding values ubar = mu/alpha defined following eqn (51).

vbar

the D x N.c matrix holding values vbar = nu/beta defined following eqn (51).
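Several of the returned quantities are standard moments of the gamma and Dirichlet distributions, so they can be recomputed from mu, alpha, and c as a sanity check. The sketch below uses hypothetical parameter values (not output of the function) and textbook identities for a gamma distribution with shape mu and rate alpha and for a Dirichlet distribution with parameters c; that the function computes its outputs in exactly this way is an assumption. The same identities apply to v with nu and beta in place of mu and alpha.

```r
# Hypothetical parameters for D = 2 dimensions and N.c = 3 components.
D <- 2; N.c <- 3
mu    <- matrix(c(4, 6, 9, 3, 5, 7),   nrow = D, ncol = N.c)  # gamma shapes
alpha <- matrix(c(2, 2, 3, 1, 2.5, 7), nrow = D, ncol = N.c)  # gamma rates
c.dir <- c(40, 55, 5)  # Dirichlet parameters (the entry named c in the list)

# Mean of the gamma distribution over u: ubar = mu / alpha.
ubar <- mu / alpha

# E[ln u] and E[(ln u - ln ubar)^2] for u ~ Gamma(shape = mu, rate = alpha).
E.lnu <- digamma(mu) - log(alpha)
E.quadratic.u <- (digamma(mu) - log(mu))^2 + trigamma(mu)

# E[pi] and E[ln pi] for pi ~ Dirichlet(c.dir).
E.pi   <- c.dir / sum(c.dir)
E.lnpi <- digamma(c.dir) - digamma(sum(c.dir))

# Hard cluster labels from a responsibility matrix (rows sum to 1):
# each item is assigned to its most probable component.
r <- rbind(c(0.90, 0.05, 0.05),
           c(0.10, 0.80, 0.10),
           c(0.20, 0.30, 0.50))
labels <- max.col(r, ties.method = "first")
```

Taking the row-wise maximum of r is one simple way to turn the soft responsibilities into a clustering; components whose entry in E.pi is close to zero have effectively attracted no items.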


genome/bmm documentation built on Aug. 4, 2022, 8:01 a.m.