Calculates and plots a set of model selection criteria for all estimated models produced by one and the same cluster method (for the sake of comparability), for various numbers H of clusters/groups and for several independent MCMC runs saved in output files located in the specified directory. Depending on the underlying model, the criteria include BIC, adjusted BIC, DIC (Deviance Information Criterion), AWE (Approximate Weight of Evidence), CLC (Classification Likelihood Criterion), ICL (Integrated Classification Likelihood) and ICL-BIC. Several maximisation methods are available for this purpose. For more information about the criteria see Details, References and references therein.
calcMSCritMCC(workDir, myLabel = "model choice for ...", H0 = 3,
    whatToDoList = c("approxMCL", "approxML", "postMode"))
calcMSCritMCCExt(workDir, NN, myLabel = "model choice for ...",
    ISdraws = 3, H0 = 3,
    whatToDoList = c("approxMCL", "approxML", "postMode"))
calcMSCritDMC(workDir, myLabel = "model choice for ...",
    myN0 = "N0 = ...",
    whatToDoList = c("approxMCL", "approxML", "postMode"))
calcMSCritDMCExt(workDir, myLabel = "model choice for ...",
    myN0 = "N0 = ...",
    whatToDoList = c("approxMCL", "approxML", "postMode"))
workDir |
A character giving the name (or full path) of the directory containing the output files of the estimated models produced by one and the same cluster method (for the sake of comparability) for which model selection criteria have to be calculated. |
NN |
Number of individuals N (just for argument/parameter checks). |
myLabel |
Specifies (part of) labeling of the plots. |
myN0 |
A character documenting the value of |
H0 |
Number of clusters/groups expected by the user. Necessary for the calculation of the model prior adjusted BIC. See Details. |
ISdraws |
Number of draws for the importance sampling step to approximate the logICL. |
whatToDoList |
A character vector containing a subset of |
For each maximisation method in whatToDoList all (available) model selection criteria are calculated (in an iterative manner). Depending on the entries in this list (whatToDoList), the calculation of (all) these criteria is based on the MCMC draw (iteration) corresponding to the maximum of the log classification likelihood ("approxMCL"), the log-likelihood ("approxML") and/or (for the sake of completeness) the log posterior density ("postMode").
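The selection of the reference draw described above can be sketched in a few lines of R. This is only an illustration under assumptions: the list traces, its toy values and the helper pickDraw are hypothetical and are not objects of the package; they stand in for per-iteration values of the three quantities.

```r
# Hypothetical per-iteration traces (toy values, not package output).
traces <- list(
  approxMCL = c(-520.3, -498.7, -501.2, -499.9),  # log classification likelihood
  approxML  = c(-480.1, -471.4, -469.8, -470.5),  # log-likelihood
  postMode  = c(-495.0, -483.2, -484.9, -482.6)   # log posterior density
)

# For one maximisation method, return the position of the draw with the
# largest trace value together with that value.
pickDraw <- function(method, traces) {
  tr <- traces[[method]]
  if (is.null(tr)) stop("unknown maximisation method: ", method)
  list(mMax = which.max(tr), maxValue = max(tr))
}

whatToDoList <- c("approxMCL", "approxML", "postMode")
selected <- lapply(setNames(whatToDoList, whatToDoList), pickDraw, traces = traces)
```

Each element of selected then holds the draw position and the maximum for one entry of whatToDoList; the criteria for that entry would be evaluated at this draw.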
Note that the user has to decide which criteria are admissible.
Which criterion needs which maximisation method? The AWE and the logICL are based on the maximum of the (log) classification likelihood, all the others on the maximum of the (log) likelihood (see References).
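The rule just stated can be written down as a small lookup. This is a sketch only: the function admissibleMethod is a hypothetical helper, not part of the package, and it encodes nothing beyond the sentence above.

```r
# Hypothetical lookup: AWE and logICL use the maximum of the classification
# likelihood, all remaining criteria the maximum of the likelihood.
admissibleMethod <- function(criterion) {
  if (criterion %in% c("AWE", "logICL")) "approxMCL" else "approxML"
}

criteria <- c("BIC", "adjBIC", "AIC", "AWE", "CLC", "IclBic",
              "DIC2", "DIC4a", "logICL")
methodFor <- vapply(criteria, admissibleMethod, character(1))
```

Such a table may help when reading the results, since every criterion is reported for every entry of whatToDoList regardless of whether the combination is admissible.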
Internally, the function also calculates the log-likelihood and related values such as LK (observed log-likelihood), CLK (classification or complete log-likelihood), CK (classification-type log-likelihood), EK (entropy term) as well as d_h (number of parameters), which are essential parts of the model selection criteria.
We calculate the model prior adjusted BIC using adjBIC = BIC - 2*H*log(H0) + 2*logΓ(H + 1) + 2*H0.
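As a minimal sketch, the adjustment can be transcribed directly into R. The function name adjBIC and the toy BIC values are assumptions made for illustration; the package computes the quantity internally and its code may differ. Algebraically, the added term equals minus twice the log of a Poisson probability with mean H0 evaluated at H, which the last line checks numerically.

```r
# Model prior adjusted BIC, transcribed from the formula above.
# lgamma(H + 1) is log(Gamma(H + 1)), i.e. log(H!).
adjBIC <- function(BIC, H, H0 = 3) {
  BIC - 2 * H * log(H0) + 2 * lgamma(H + 1) + 2 * H0
}

# Toy BIC values for H = 1, ..., 5 (illustrative numbers only).
H   <- 1:5
BIC <- c(1250.4, 1190.2, 1172.9, 1171.5, 1176.8)
result <- data.frame(H = H, BIC = BIC, adjBIC = adjBIC(BIC, H, H0 = 3))

# The adjustment term coincides with -2 * log Poisson(H; H0).
adjustmentMatchesPoisson <- isTRUE(all.equal(adjBIC(0, H, 3),
                                             -2 * dpois(H, 3, log = TRUE)))
```

The adjustment is smallest for H close to H0, so models whose number of groups is far from the expected number are penalised more strongly.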
Depending on the model type used, the following criteria are calculated: BIC, adjusted BIC, AIC, AWE, IclBic, CLC, DIC2, DIC4 and logICL (see References). Furthermore, plots and tables of selected criteria are generated (and the plots are also saved in directory workDir).
To document the progress of the iteration, some information is recorded for each output file (containing an MCMC run), depending on the maximisation method: a running number, the maximisation method, the number of clusters/groups, BIC, adjusted BIC, AIC, AWE, CLC, IclBic, DIC2, DIC4a, ICL and additionally the adjusted Rand index (which compares the starting with the final allocation).
For each entry in whatToDo a matrix MSCritTable is produced. Each row represents a processed output file (containing an MCMC run) and the columns contain:
H
number of clusters/groups
mMax
number/position of the MCMC draw/iteration leading to the maximum value of the (log-)posterior density or (classification) log-likelihood (depending on whatToDo), which is calculated for each MCMC draw
maxLPD
the maximum value of the (log-)posterior density itself, only if whatToDo includes "postMode"; corresponds to the posterior mode
maxLL
the maximum value of the log-likelihood itself, only if whatToDo includes "approxML"; corresponds to the 'approximate maximum likelihood'
maxLCL
the maximum value of the classification log-likelihood itself, only if whatToDo includes "approxMCL"; corresponds to the 'approximate maximum classification likelihood'
BIC
Bayesian Information Criterion (Schwarz Criterion)
adjBIC
adjusted BIC. Note: not available/implemented for DMC[Ext]!
AIC
Akaike Information Criterion
AWE
Approximate Weight of Evidence, see Banfield and Raftery (1993)
CLC
Classification Likelihood Criterion
IclBic
Integrated Classification Likelihood-BIC
DIC2
Deviance Information Criterion (DIC2), see Fruehwirth-Schnatter and Pyne (2010) and Fruehwirth-Schnatter et al. (2011). Note: not available/implemented for DMC!
DIC4a
Deviance Information Criterion (DIC4a), see Fruehwirth-Schnatter and Pyne (2010) and Fruehwirth-Schnatter et al. (2011). Note: not available/implemented for DMC!
logICL
log Integrated Classification Likelihood. Note: not available/implemented for DMC[Ext]!
adjRand
adjusted Rand index for the (estimated) group membership versus the starting values Initial$S.i.start (only if not NULL)
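To illustrate how such a table might be read, the following sketch builds a toy matrix with a few of the columns listed above and reports, per criterion, the number of groups of the best row. The numbers are invented, bestH is a hypothetical helper, and the smaller-is-better convention is an assumption that should be checked against the References for each criterion.

```r
# Toy stand-in for an MSCritTable: one row per processed output file.
MSCritTable <- cbind(
  H   = c(2, 3, 4, 5),
  BIC = c(1190.2, 1172.9, 1171.5, 1176.8),
  AIC = c(1150.7, 1120.3, 1105.8, 1098.0)
)

# For one criterion column, return the H of the row with the smallest value
# (assumes that smaller values indicate the preferred model).
bestH <- function(tab, crit) unname(tab[which.min(tab[, crit]), "H"])

chosen <- sapply(c("BIC", "AIC"), bestH, tab = MSCritTable)
```

In this toy table the two criteria disagree on the preferred H, which is exactly the situation in which the plots and the user's judgement about admissible criteria come into play.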
For each entry in whatToDo the corresponding MSCritTable is printed together with the current working directory and the content of the current whatToDo. Further, plots of the model selection criteria are produced and saved (with type eps and pdf).
If MCCExt is considered, the number of importance sampling draws ISdraws (necessary for logICL) is also printed.
Additionally, after each iteration the workspace containing the model selection criteria and other objects is saved to a .RData file via save.image within directory workDir.
Finally, a list containing the names of the processed output files (each containing an MCMC run) is printed.
A list containing:
postMode |
the corresponding |
approxML |
the corresponding |
approxMCL |
the corresponding |
ISdraws |
the number of importance sampling draws for approximating logICL (only for MCCExt) |
outFileNames |
a list (character vector) containing the names of the processed output files (each containing an MCMC run) |
Note that the user has to decide which criteria are admissible.
Note that, in contrast to the literature (see References), the states of the categorical outcome variable (time series) are sometimes numbered (labelled) 0,...,K in this package (instead of 1,...,K); in that case there are K+1 categories (states)!
Christoph Pamminger <christoph.pamminger@gmail.com>
Jeffrey D. Banfield and Adrian E. Raftery, (1993), "Model-Based Gaussian and Non-Gaussian Clustering". Biometrics, Vol. 49, No. 3, pp. 803-821. http://www.jstor.org/stable/2532201
Sylvia Fruehwirth-Schnatter, Christoph Pamminger, Andrea Weber and Rudolf Winter-Ebmer, (2011), "Labor market entry and earnings dynamics: Bayesian inference using mixtures-of-experts Markov chain clustering". Journal of Applied Econometrics. DOI: 10.1002/jae.1249 http://onlinelibrary.wiley.com/doi/10.1002/jae.1249/abstract
Sylvia Fruehwirth-Schnatter and Saumyadipta Pyne, (2010), "Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions". Biostatistics, Vol. 11, No. 2, pp. 317-336. DOI: 10.1093/biostatistics/kxp062 http://biostatistics.oxfordjournals.org/content/11/2/317.full.pdf+html
Christoph Pamminger and Sylvia Fruehwirth-Schnatter, (2010), "Model-based Clustering of Categorical Time Series". Bayesian Analysis, Vol. 5, No. 2, pp. 345-368. DOI: 10.1214/10-BA606 http://ba.stat.cmu.edu/journal/2010/vol05/issue02/pamminger.pdf
classAgreement, savePlot, mcClust, dmClust, mcClustExtended, dmClustExtended
# please run the examples in mcClust, dmClust, mcClustExtended,
# dmClustExtended