AIC3Function: Model Selection Via Akaike Information Criterion by Bozdogan...

View source: R/mplnMCMCEMClustering.R

AIC3Function    R Documentation

Model Selection Via Akaike Information Criterion by Bozdogan (1994)

Description

Performs model selection using the variant of the Akaike Information Criterion proposed by Bozdogan (1994), called AIC3. Formula: AIC3 = -2 * logLikelihood + 3 * nParameters.
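As a rough illustration of the formula, the base R sketch below computes AIC3 for three candidate cluster sizes. The log-likelihoods and parameter counts are made-up values, not output from MPLNClust. The sketch assumes that the candidate with the smallest AIC3 is preferred, which is the usual convention for criteria written with this sign, and that position i in each vector corresponds to gmin + i - 1 components.

```r
# Illustrative inputs only (not results from a real clustering run)
gmin <- 1
gmax <- 3
logLikelihood <- c(-1250.4, -1180.2, -1175.9)  # one value per cluster size
nParameters   <- c(27, 55, 83)                 # one value per cluster size

# AIC3 = -2 * logLikelihood + 3 * nParameters
aic3Values <- -2 * logLikelihood + 3 * nParameters

# Assumption: the smallest AIC3 marks the preferred model
candidateSizes <- seq(gmin, gmax, by = 1)
selectedSize <- candidateSizes[which.min(aic3Values)]

print(aic3Values)
print(selectedSize)
```

The heavier penalty of 3 per parameter (compared with 2 in the original AIC) means that adding a component has to improve the log-likelihood by more before it is selected.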

Usage

AIC3Function(
  logLikelihood,
  nParameters,
  clusterRunOutput = NA,
  gmin,
  gmax,
  parallel = FALSE
)

Arguments

logLikelihood

A numeric vector of final log-likelihood values, one for each cluster size.

nParameters

A vector giving the number of parameters for each cluster size.

clusterRunOutput

Output from mplnVariational, mplnMCMCParallel, or mplnMCMCNonParallel, if available. Default value is NA. If provided, the output will include the vector of cluster labels obtained by mclust::map() for the best model.

gmin

A positive integer specifying the minimum number of components to be considered in the clustering run.

gmax

A positive integer, greater than gmin, specifying the maximum number of components to be considered in the clustering run.

parallel

TRUE or FALSE indicating if MPLNClust::mplnMCMCParallel has been used.

Value

Returns an S3 object of class MPLN with results.

  • allAIC3values - A vector of AIC3 values for each cluster size.

  • AIC3modelselected - An integer specifying the model selected by AIC3.

  • AIC3modelSelectedLabels - A vector of integers specifying cluster labels for the selected model. Only provided if the user supplied clusterRunOutput.

  • AIC3Message - A character vector indicating if spurious clusters are detected. Otherwise, NA.
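The cluster labels are described as coming from mclust::map(), which assigns each observation to the component with the highest posterior probability. A minimal base R sketch of that assignment rule follows. The matrix zMatrix is a hypothetical, hand-written posterior probability matrix, not output from this package, and the sketch does not call mclust.

```r
# Hypothetical posterior probabilities: rows are observations,
# columns are components (each row sums to 1)
zMatrix <- matrix(c(0.90, 0.10,
                    0.20, 0.80,
                    0.55, 0.45,
                    0.05, 0.95),
                  ncol = 2, byrow = TRUE)

# Maximum a posteriori rule: pick the column with the largest
# probability in each row
clusterLabels <- apply(zMatrix, 1, which.max)

print(clusterLabels)
```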

Author(s)

Anjali Silva, anjali@alumni.uoguelph.ca

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory, New York, NY, USA, pp. 267–281. Springer Verlag.

Bozdogan, H. (1994). Mixture-model cluster analysis using model selection criteria and a new informational measure of complexity. In Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: An Informational Approach: Volume 2 Multivariate Statistical Modeling, pp. 69–113. Dordrecht: Springer Netherlands.

Examples

trueMu1 <- c(6.5, 6, 6, 6, 6, 6)
trueMu2 <- c(2, 2.5, 2, 2, 2, 2)

trueSigma1 <- diag(6) * 2
trueSigma2 <- diag(6)

# Generating simulated data
sampleData <- MPLNClust::mplnDataGenerator(
  nObservations = 100,
  dimensionality = 6,
  mixingProportions = c(0.79, 0.21),
  mu = rbind(trueMu1, trueMu2),
  sigma = rbind(trueSigma1, trueSigma2),
  produceImage = "No"
)

# Clustering
mplnResults <- MPLNClust::mplnVariational(
  dataset = sampleData$dataset,
  membership = sampleData$trueMembership,
  gmin = 1,
  gmax = 2,
  initMethod = "kmeans",
  nInitIterations = 2,
  normalize = "Yes"
)

# Model selection
AIC3model <- MPLNClust::AIC3Function(
  logLikelihood = mplnResults$logLikelihood,
  nParameters = mplnResults$numbParameters,
  clusterRunOutput = mplnResults$allResults,
  gmin = mplnResults$gmin,
  gmax = mplnResults$gmax,
  parallel = FALSE
)


anjalisilva/MPLNClust documentation built on Jan. 28, 2024, 11:02 a.m.