MoE_cstep: C-step for MoEClust Models

MoE_cstep    R Documentation

C-step for MoEClust Models

Description

Function to compute the assignment matrix z and the conditional log-likelihood for MoEClust models, with the aid of MoE_dens.

Usage

MoE_cstep(data,
          mus,
          sigs,
          log.tau = 0L,
          Vinv = NULL,
          Dens = NULL)

Arguments

data

If there are no expert network covariates, data should be a numeric matrix or data frame, wherein rows correspond to observations (n) and columns correspond to variables (d). If there are expert network covariates, this should be a list of length G containing matrices/data.frames of (multivariate) WLS residuals for each component.

mus

The mean for each of the G components. If there is more than one component, this is a matrix whose k-th column is the mean of the k-th component of the mixture model. For univariate models, this is a G-vector of means. In the presence of expert network covariates, all values should be equal to 0.

sigs

The variance component in the parameters list from the output of, e.g., MoE_clust. The components of this list depend on the specification of modelName (see mclustVariance for details). The number of components G, the number of variables d, and the modelName are inferred from sigs.

log.tau

If covariates enter the gating network, an n x G matrix of mixing proportions; otherwise, a G-vector of mixing proportions for the components of the mixture. Must be on the log scale in both cases. The default of 0 effectively means that the densities (or log-densities) are not scaled by the mixing proportions.

Vinv

An estimate of the reciprocal hypervolume of the data region. See the function noise_vol. Used only if an initial guess as to which observations are noise is supplied. Mixing proportion(s) must be included for the noise component also.

Dens

(Optional) A numeric matrix whose [i,k]-th entry is the log-density of observation i in component k, scaled by the mixing proportions, to which the function is to be applied. This is typically, but not necessarily, obtained from MoE_dens. If Dens is supplied, all other arguments are ignored; otherwise, MoE_dens is called with the other supplied arguments.
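
As a rough illustration of the log.tau argument, the base-R sketch below shows how log mixing proportions might be added to component log-densities in the two cases described for that argument. All numbers and object names are invented for illustration, and the additive scaling is the usual mixture-model convention, assumed here rather than taken from the MoE_dens source.

```r
# Hypothetical unscaled log-densities: n = 3 observations (rows), G = 2 components (columns)
logdens <- matrix(c(-1.0, -2.0,
                    -1.5, -0.5,
                    -3.0, -2.5), nrow = 3, byrow = TRUE)

# Case 1 (no gating covariates): a G-vector of log mixing proportions,
# added to every row, i.e. one value per column
log_tau_vec <- log(c(0.6, 0.4))
scaled_vec  <- sweep(logdens, 2, log_tau_vec, "+")

# Case 2 (gating covariates): an n x G matrix of log mixing proportions,
# added elementwise, so each observation has its own proportions
tau_mat    <- matrix(c(0.7, 0.3,
                       0.2, 0.8,
                       0.5, 0.5), nrow = 3, byrow = TRUE)
scaled_mat <- logdens + log(tau_mat)

# The default log.tau = 0 leaves the log-densities unscaled
scaled_zero <- logdens + 0
```

Working on the log scale turns multiplication by the mixing proportions into addition, which is why a default of 0 corresponds to no scaling.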

Value

A list containing two elements:

z

A matrix with n rows and G columns containing 1 where the observation belongs to the cluster indicated by the column number, and 0 otherwise.

loglik

The estimated conditional log-likelihood.
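
The two returned elements can be sketched in base R as follows. This is a minimal illustration, assuming the C-step assigns each observation to the component with the largest scaled log-density and sums those maxima to obtain the conditional log-likelihood; it is not the package's own implementation, and the input matrix is invented.

```r
# Hypothetical 4 x 2 matrix of scaled log-densities
# (rows: observations, columns: components)
dens <- matrix(c(-1.2, -3.4,
                 -2.5, -0.7,
                 -0.9, -1.1,
                 -4.0, -2.2), nrow = 4, byrow = TRUE)

# Hard assignment: index of the largest log-density in each row
# (ties.method = "first" keeps the sketch deterministic)
cls <- max.col(dens, ties.method = "first")

# Indicator matrix z: a single 1 per row, in the column of the assigned component
z <- matrix(0, nrow = nrow(dens), ncol = ncol(dens))
z[cbind(seq_len(nrow(dens)), cls)] <- 1

# Conditional log-likelihood: sum of the log-densities picked out by z
loglik <- sum(z * dens)
```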

Note

This function is intended for joint use with MoE_dens, using the log-densities. Caution is advised when using this function without explicitly naming the arguments. Models with a noise component are also supported.
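
One way to picture the noise component is as an extra column in the log-density matrix. The sketch below assumes the mclust-style convention that noise has a constant density equal to Vinv, scaled by its own mixing proportion; the values of Vinv and the noise proportion are hypothetical, and this is not a description of the package internals.

```r
# Hypothetical scaled log-densities for 3 observations and G = 2 Gaussian components
dens <- matrix(c(-1.0, -2.0,
                 -9.0, -8.5,
                 -3.0, -2.5), nrow = 3, byrow = TRUE)

Vinv     <- 0.01      # assumed reciprocal hypervolume of the data region
log_tau0 <- log(0.1)  # assumed log mixing proportion of the noise component

# Noise column: the same value for every observation (assumed uniform density)
dens_noise <- cbind(dens, noise = log_tau0 + log(Vinv))

# Hard assignment over G + 1 columns; the last column means "noise"
cls <- max.col(dens_noise, ties.method = "first")
```

Under these assumptions, the second observation, which is poorly explained by both Gaussian components, falls into the noise column.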

The C-step can be replaced by an E-step; see MoE_estep and the algo argument to MoE_control.
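
To make the difference between the two steps concrete, the base-R sketch below contrasts soft assignments (a generic E-step computed with the log-sum-exp trick) with hard assignments (a generic C-step) on the same invented log-density matrix. It illustrates the standard EM and CEM definitions and is not taken from the MoE_estep or MoE_cstep source.

```r
# Hypothetical 2 x 2 matrix of scaled log-densities
dens <- matrix(c(-1.2, -3.4,
                 -2.5, -0.7), nrow = 2, byrow = TRUE)

# E-step: responsibilities via a numerically stable log-sum-exp per row
row_max     <- apply(dens, 1, max)
lse         <- row_max + log(rowSums(exp(dens - row_max)))
z_soft      <- exp(dens - lse)   # each row sums to 1
loglik_soft <- sum(lse)          # observed-data log-likelihood

# C-step: each observation is assigned entirely to its best component
cls         <- max.col(dens, ties.method = "first")
z_hard      <- diag(ncol(dens))[cls, , drop = FALSE]
loglik_hard <- sum(row_max)      # conditional (classification) log-likelihood
```

Because each row maximum can never exceed the corresponding log-sum-exp, loglik_hard is never larger than loglik_soft.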

Author(s)

Keefe Murphy - <keefe.murphy@mu.ie>

See Also

MoE_dens, MoE_clust, MoE_estep, MoE_control, mclustVariance

Examples

# MoE_cstep is invoked when fitting MoEClust models with the CEM algorithm,
# which is selected via the 'algo' argument to MoE_control:
library(MoEClust)
data(ais)
hema   <- ais[,3:7]
model  <- MoE_clust(hema, G=3, gating= ~ BMI + sex, modelNames="EEE", network.data=ais, algo="CEM")
Dens   <- MoE_dens(data=hema, mus=model$parameters$mean,
                   sigs=model$parameters$variance, log.tau=log(model$parameters$pro))

# Construct the z matrix and compute the conditional log-likelihood
Cstep  <- MoE_cstep(Dens=Dens)
(ll    <- Cstep$loglik)

# Check that the z matrix & classification are the same as those from the model
identical(max.col(Cstep$z), as.integer(unname(model$classification))) #TRUE
identical(Cstep$z, model$z)                                           #TRUE

# Call MoE_cstep directly
Cstep2 <- MoE_cstep(data=hema, sigs=model$parameters$variance,
                    mus=model$parameters$mean, log.tau=log(model$parameters$pro))
identical(Cstep2$loglik, ll)                                          #TRUE

Keefe-Murphy/MoEClust documentation built on Feb. 1, 2024, 4:36 a.m.