Mixtures-of-Experts Markov Chain Clustering and Dirichlet Multinomial Clustering
This package provides various Markov Chain Monte Carlo (MCMC) samplers for Bayesian model-based clustering of discrete-valued time series obtained by observing a categorical variable with several states. The methods are based on finite mixtures of first-order time-homogeneous Markov chain models with unknown transition matrices. In the first approach, Markov chain clustering, the individual transition probabilities are fixed to a group-specific transition matrix. In the second approach, Dirichlet multinomial clustering, unobserved heterogeneity is assumed to remain within each group; it is captured by allowing the individual transition matrices to deviate from the group means, with the variation in each row described by a Dirichlet distribution with unknown hyperparameters. Further, in order to analyze group membership, we also provide an extension of both approaches that formulates a probabilistic model for the latent group indicators within the Bayesian classification rule. In other words, unobserved group membership is modeled by a multinomial logit model, which allows it to depend on individual-specific and other characteristics. Additionally, functions to process the results are provided.
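The difference between the two clustering approaches, and the role of the multinomial logit model, can be illustrated with a small simulation. This is only a sketch in base R: all names (`xi`, `precision`, `simulate_chain`, `rdirichlet_one`, `beta`) and all numbers are hypothetical and are not part of the package, and writing the Dirichlet parameters as "group mean times a precision" as well as using the first group as the logit baseline are assumptions made for illustration.

```r
# Illustrative simulation only; none of these names come from the package.
set.seed(1)

n_states <- 3   # number of categories of the outcome variable
n_time   <- 20  # length of each simulated series

# Hypothetical group-specific transition matrix (each row sums to 1).
xi <- matrix(c(0.8, 0.1, 0.1,
               0.2, 0.6, 0.2,
               0.1, 0.3, 0.6),
             nrow = n_states, byrow = TRUE)

# Draw one probability vector from a Dirichlet(alpha) via normalised gammas.
rdirichlet_one <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

# Simulate a first-order time-homogeneous Markov chain with states 1..n_states.
simulate_chain <- function(P, n_time, start = 1) {
  y <- integer(n_time)
  y[1] <- start
  for (t in 2:n_time) {
    y[t] <- sample.int(nrow(P), size = 1, prob = P[y[t - 1], ])
  }
  y
}

# Markov chain clustering: every individual in the group uses xi itself.
y_mc <- simulate_chain(xi, n_time)

# Dirichlet multinomial clustering: each row of the individual matrix is drawn
# from a Dirichlet whose mean is the matching row of xi; the (assumed)
# precision controls how far individuals deviate from the group mean.
precision <- 10
P_i <- t(apply(xi * precision, 1, rdirichlet_one))
y_dm <- simulate_chain(P_i, n_time)

# Mixtures-of-experts extension: prior group probabilities from a multinomial
# logit model with the first group as baseline (coefficients are made up).
x_i  <- c(1, 0.5)                      # intercept and one covariate
beta <- cbind(c(0, 0), c(0.3, -1.2))   # one column of coefficients per group
eta  <- as.vector(x_i %*% beta)
group_prob <- exp(eta) / sum(exp(eta))
```

A larger `precision` keeps `P_i` close to `xi`, so the Dirichlet multinomial model approaches plain Markov chain clustering; a smaller value lets individuals in the same group differ more strongly.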
The main functions are mcClust for Markov chain clustering and dmClust for Dirichlet multinomial clustering, as well as mcClustExtended and dmClustExtended, which also include the mixtures-of-experts extension. These functions require the data in a special structure (see Njk.i in the Examples therein); functions such as dataFrameToNjki are provided to help prepare the data (see the examples therein). Additionally, the function MNLAuxMix is provided for multinomial logit regression using the auxiliary mixture approach (see References). Because these methods are Bayesian, prior information can also be incorporated, and parameters such as transition probabilities, regression coefficients and mixing proportions are estimated with MCMC algorithms. For more details about the models and estimation procedures see References. The results are returned as lists and also saved to output files. Further functions are provided to analyse and visualise the results; for example, the (group-specific) transition probabilities can be visualised with plotTransProbs. Finally, some well-known model selection criteria can also be calculated.
Note that, in contrast to the literature (see References), the states of the categorical outcome variable (time series) are sometimes labelled 0,...,K in this package (instead of 1,...,K); in that case there are K+1 categories (states).
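To make the data structure and the 0,...,K labelling concrete, the following sketch counts the transitions of each individual series and stacks them into a three-dimensional array. It uses base R only; the names `series`, `count_transitions` and `counts_array` are hypothetical, and whether the package's Njk.i has exactly this layout is an assumption (see the examples of the functions named above for the authoritative format).

```r
# Sketch only: base R, hypothetical names; not the package's own helper.
# Each row of `series` is one individual's time series with states 0..K,
# so K = 2 means K + 1 = 3 categories.
K <- 2
series <- rbind(c(0, 0, 1, 2, 2, 1),
                c(2, 2, 2, 1, 0, 0),
                c(1, 0, 1, 0, 1, 0))

# Count transitions from state j (rows) to state k (columns) in one series.
count_transitions <- function(y, K) {
  from <- factor(y[-length(y)], levels = 0:K)
  to   <- factor(y[-1], levels = 0:K)
  counts <- table(from, to)
  matrix(counts, nrow = K + 1, dimnames = dimnames(counts))
}

# Stack the individual count matrices into a (K+1) x (K+1) x N array.
N <- nrow(series)
counts_array <- array(0, dim = c(K + 1, K + 1, N),
                      dimnames = list(from = 0:K, to = 0:K,
                                      individual = seq_len(N)))
for (i in seq_len(N)) {
  counts_array[, , i] <- count_transitions(series[i, ], K)
}
```

Using factors with explicit levels 0:K ensures that states an individual never visits still appear as rows and columns of zeros, so every slice of the array has the same dimension.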
Christoph Pamminger <email@example.com>
Maintainer: Christoph Pamminger <firstname.lastname@example.org>
Sylvia Fruehwirth-Schnatter, Christoph Pamminger, Andrea Weber and Rudolf Winter-Ebmer, (2011), "Labor market entry and earnings dynamics: Bayesian inference using mixtures-of-experts Markov chain clustering". Journal of Applied Econometrics. DOI: 10.1002/jae.1249 http://onlinelibrary.wiley.com/doi/10.1002/jae.1249/abstract
Christoph Pamminger and Sylvia Fruehwirth-Schnatter, (2010), "Model-based Clustering of Categorical Time Series". Bayesian Analysis, Vol. 5, No. 2, pp. 345-368. DOI: 10.1214/10-BA606 http://ba.stat.cmu.edu/journal/2010/vol05/issue02/pamminger.pdf
Sylvia Fruehwirth-Schnatter and Rudolf Fruehwirth, (2010), "Data augmentation and MCMC for binary and multinomial logit models". In T. Kneib and G. Tutz (eds): Statistical Modelling and Regression Structures: Festschrift in Honour of Ludwig Fahrmeir. Physica Verlag, Heidelberg, pp. 111-132. DOI: 10.1007/978-3-7908-2413-1_7 http://www.springerlink.com/content/t4h810017645wh68/. See also: IFAS Research Paper Series 2010-48 (http://www.jku.at/ifas/content/e108280/e108491/e108471/e109880/ifas_rp48.pdf).
# please run the examples in mcClust, dmClust, mcClustExtended,
# dmClustExtended, MNLAuxMix