MoE_crit: MoEClust BIC, ICL, and AIC Model-Selection Criteria


Description

Computes the BIC (Bayesian Information Criterion), ICL (Integrated Complete Likelihood), and AIC (Akaike Information Criterion) for parsimonious mixture of experts models given the log-likelihood, the dimension of the data, the number of mixture components in the model, the numbers of parameters in the gating and expert networks respectively, and, for the ICL, the numbers of observations in each component.

Usage

MoE_crit(modelName,
         loglik,
         n,
         d,
         G,
         gating.pen = G - 1L,
         expert.pen = G * d,
         z = NULL,
         df = NULL)

Arguments

modelName

A character string indicating the model. The help file for mclustModelNames describes the available models. Not necessary if df is supplied.

loglik

The log-likelihood for a data set with respect to the Gaussian mixture model specified in the modelName argument.

n, d, G

The number of observations in the data, the dimension of the data, and the number of components in the Gaussian mixture model, respectively, used to compute loglik. d and G are not necessary if df is supplied.

gating.pen

The number of parameters of the gating network of the MoEClust model. Defaults to G - 1, which corresponds to no gating covariates. If covariates are included, this should be the number of regression coefficients in the fitted gating object. If there are no covariates and the mixing proportions are further constrained to be equal, gating.pen should be 0. The number of parameters used in the estimation of the noise component, if any, should also be included. Not necessary if df is supplied.

expert.pen

The number of parameters of the expert network of the MoEClust model. Defaults to G * d, which corresponds to no expert covariates. If covariates are included, this should be the number of regression coefficients in the fitted expert object. Not necessary if df is supplied.

z

The n times G responsibility matrix whose [i,k]-th entry is the probability that observation i belongs to the k-th component. If supplied, the ICL is also computed and returned; otherwise only the BIC and AIC are returned.

df

An alternative way to specify the number of estimated parameters (or 'used' degrees of freedom) exactly. If supplied, the arguments modelName, d, G, gating.pen, and expert.pen, which are used to calculate the number of parameters, will be ignored. The number of parameters used in the estimation of the noise component, if any, should also be included.

Details

The function is vectorised with respect to the arguments modelName and loglik.

If model is an object of class "MoEClust" with G components, the appropriate values for gating.pen and expert.pen are length(coef(model$gating)) and G * length(coef(model$expert[[1]])), respectively.
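The parameter count can be illustrated with a small base-R sketch. It assumes that the total number of estimated parameters is the sum of the covariance parameters implied by modelName, gating.pen, and expert.pen. The helper n_cov_params is hypothetical and handles only two model types; nVarParams from mclust covers the full set.

```r
# Hypothetical helper (not part of MoEClust): covariance parameter counts
# for two mclust model types only.
n_cov_params <- function(modelName, d, G) {
  switch(modelName,
         VVI = G * d,                # diagonal covariance per component
         VVV = G * d * (d + 1) / 2,  # full covariance per component
         stop("model not handled in this sketch"))
}

d <- 8
G <- 3
gating_pen <- G - 1  # default: mixing proportions, no gating covariates
expert_pen <- G * d  # default: component means, no expert covariates

# Assumed additive decomposition of the total parameter count
df_VVI <- n_cov_params("VVI", d, G) + gating_pen + expert_pen
df_VVV <- n_cov_params("VVV", d, G) + gating_pen + expert_pen
```

Under these assumptions the diagonal model uses far fewer parameters than the unconstrained one, which is what the penalty terms of the criteria reward.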

Models with a noise component are also supported, provided the user accounts for the extra parameters.

Value

A simplified array containing the BIC, AIC, and number of estimated parameters (df) and, if z is supplied, also the ICL, for each of the given input arguments.
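As a rough guide to how the returned quantities relate to each other, the following base-R sketch computes the three criteria from a log-likelihood, a sample size, and a parameter count. It assumes the mclust sign convention, in which larger values indicate better models, and an ICL obtained by adding twice the log of each observation's largest responsibility to the BIC. The function crit_sketch and the toy matrix z are illustrative only and are not the package's implementation.

```r
# Illustrative only: criteria under the assumed "larger is better" convention.
crit_sketch <- function(loglik, n, df, z = NULL) {
  bic <- 2 * loglik - df * log(n)
  aic <- 2 * loglik - 2 * df
  out <- c(bic = bic, aic = aic, df = df)
  if (!is.null(z)) {
    # Adds a non-positive term that grows in magnitude as the
    # component assignments become less certain.
    out["icl"] <- bic + 2 * sum(log(apply(z, 1, max)))
  }
  out
}

# Toy responsibility matrix: 3 observations, 2 components, rows sum to 1
z <- matrix(c(0.9, 0.1,
              0.2, 0.8,
              0.5, 0.5), ncol = 2, byrow = TRUE)

res <- crit_sketch(loglik = -100, n = 3, df = 4, z = z)
```

In this sketch the ICL can never exceed the BIC, and the two coincide only when every observation is assigned to a single component with probability 1.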

Note

In order to speed up repeated calls to the function inside MoE_clust, no checks take place.

Author(s)

Keefe Murphy - <keefe.murphy@mu.ie>

References

Biernacki, C., Celeux, G. and Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans. Pattern Analysis and Machine Intelligence, 22(7): 719-725.

See Also

MoE_clust, nVarParams, mclustModelNames

Examples

library(MoEClust)

# Vectorised call: one log-likelihood per candidate model
MoE_crit(modelName=c("VVI", "VVE", "VVV"), n=120, d=8,
         G=3, loglik=c(-4036.99, -3987.12, -3992.45))

data(CO2data)
GNP   <- CO2data$GNP
model <- MoE_clust(CO2data$CO2, G=1:2, expert= ~ GNP)
G     <- model$G
name  <- model$modelName
ll    <- max(model$loglik)
n     <- length(CO2data$CO2)
z     <- model$z

# Compare BIC from MoE_crit to the BIC of the model
(bic2 <- MoE_crit(modelName=name, loglik=ll, n=n, d=1, G=G, z=z,
                  expert.pen=G * length(coef(model$expert[[1]])))["bic",])
identical(unname(bic2), model$bic) #TRUE

# Make the same comparison with the known number of estimated parameters
(bic3 <- MoE_crit(loglik=ll, n=n, df=model$df, z=z)["bic",])
identical(bic3, bic2)              #TRUE

Keefe-Murphy/MoEClust documentation built on Jan. 11, 2021, 6:34 p.m.