mcclust-package | R Documentation |
Implements methods for processing a sample of (hard) clusterings, e.g. the MCMC output of a Bayesian clustering model. Among them are methods that find a single best clustering to represent the sample, which are based on the posterior similarity matrix or a relabelling algorithm.
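To make the posterior similarity matrix concrete, here is a minimal base-R sketch of how such a matrix can be computed from a sample of clusterings. The toy data and the helper name psm_by_hand are assumptions made for illustration; they are not part of mcclust and do not show how the package computes it.

```r
# Toy sample: 4 clusterings (rows) of 5 observations (columns).
cls <- rbind(c(1, 1, 2, 2, 3),
             c(1, 1, 2, 2, 2),
             c(2, 2, 1, 1, 3),   # same partition as row 1, labels switched
             c(1, 1, 1, 2, 2))

# Hypothetical helper (not from mcclust): entry [i, j] is the share of
# sampled clusterings in which observations i and j share a cluster.
psm_by_hand <- function(cls) {
  n <- ncol(cls)
  psm <- matrix(0, n, n)
  for (m in seq_len(nrow(cls))) {
    # indicator matrix: TRUE where two observations have the same label
    psm <- psm + outer(cls[m, ], cls[m, ], "==")
  }
  psm / nrow(cls)
}

psm <- psm_by_hand(cls)
print(psm)
```

Rows 1 and 3 of the toy sample describe the same partition with switched labels and contribute identically, so a matrix built this way does not depend on how the clusters are labelled.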
Package: mcclust
Type: Package
Version: 1.0
Date: 2009-03-12
License: GPL (>= 2)
LazyLoad: yes
Most important functions:
comp.psm computes the posterior similarity matrix (PSM). Based on the PSM, maxpear and minbinder provide several optimization methods to find either a clustering with maximal posterior expected adjusted Rand index with the true clustering or one that minimizes the posterior expectation of a loss function by Binder (1978). minbinder also provides the optimization algorithm of Lau and Green (2007). relabel contains the relabelling algorithm of Stephens (2000). arandi and vi.dist compute measures for comparing clusterings: the (adjusted) Rand index and the entropy-based variation of information distance.
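The idea of minimizing the posterior expected Binder loss can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses equal misclassification costs, searches only among the sampled clusterings, and relies on toy data and a hypothetical helper binder_loss_by_hand. It is not the implementation used by minbinder, which offers further optimization methods.

```r
# Toy sample: 4 clusterings (rows) of 5 observations (columns).
cls <- rbind(c(1, 1, 2, 2, 3),
             c(1, 1, 2, 2, 2),
             c(2, 2, 1, 1, 3),
             c(1, 1, 1, 2, 2))

# Posterior similarity matrix: average of the co-clustering indicators.
indicators <- lapply(seq_len(nrow(cls)),
                     function(m) outer(cls[m, ], cls[m, ], "=="))
psm <- Reduce(`+`, indicators) / nrow(cls)

# Hypothetical helper (not from mcclust): expected Binder loss with equal
# costs, i.e. the sum over pairs i < j of |1(c_i = c_j) - psm[i, j]|.
binder_loss_by_hand <- function(cl, psm) {
  same <- outer(cl, cl, "==")
  upper <- upper.tri(psm)
  sum(abs(same[upper] - psm[upper]))
}

# Evaluate every sampled clustering and keep the one with the lowest loss.
losses <- apply(cls, 1, binder_loss_by_hand, psm = psm)
best <- cls[which.min(losses), ]
print(losses)
print(best)
```

Restricting the search to the sampled clusterings keeps the sketch short; the clustering with the smallest loss need not have been visited by the sampler.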
Arno Fritsch
Maintainer: Arno Fritsch <arno.fritsch@tu-dortmund.de>
Binder, D.A. (1978) Bayesian cluster analysis, Biometrika 65, 31–38.
Fritsch, A. and Ickstadt, K. (2009) An improved criterion for clustering based on the posterior similarity matrix, Bayesian Analysis, accepted.
Lau, J.W. and Green, P.J. (2007) Bayesian model based clustering procedures, Journal of Computational and Graphical Statistics 16, 526–558.
Stephens, M. (2000) Dealing with label switching in mixture models. Journal of the Royal Statistical Society Series B, 62, 795–809.
library(mcclust)

data(cls.draw2)                    # sample of 500 clusterings from a Bayesian cluster model
tru.class <- rep(1:8, each = 50)   # the true grouping of the observations
psm2 <- comp.psm(cls.draw2)        # posterior similarity matrix

# optimize criteria based on PSM
mbind2 <- minbinder(psm2)
mpear2 <- maxpear(psm2)

# Relabelling: keep only the draws with the most frequent number of clusters
k <- apply(cls.draw2, 1, function(cl) length(table(cl)))
max.k <- as.numeric(names(table(k))[which.max(table(k))])
relab2 <- relabel(cls.draw2[k == max.k, ])

# compare clusterings found by different methods with true grouping
arandi(mpear2$cl, tru.class)
arandi(mbind2$cl, tru.class)
arandi(relab2$cl, tru.class)
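To show what the comparison measures quantify, here is a base-R sketch of the adjusted Rand index and the variation of information using their standard textbook formulas. The helper names ari_by_hand and vi_by_hand are hypothetical, and the choice of log base 2 is an assumption of the sketch; neither helper is guaranteed to reproduce the output of arandi or vi.dist exactly.

```r
# Hypothetical helper: adjusted Rand index from the contingency table.
ari_by_hand <- function(cl1, cl2) {
  tab <- table(cl1, cl2)
  n <- sum(tab)
  pairs_both <- sum(choose(tab, 2))          # pairs together in both clusterings
  pairs_1 <- sum(choose(rowSums(tab), 2))    # pairs together in cl1
  pairs_2 <- sum(choose(colSums(tab), 2))    # pairs together in cl2
  expected <- pairs_1 * pairs_2 / choose(n, 2)
  (pairs_both - expected) / ((pairs_1 + pairs_2) / 2 - expected)
}

# Hypothetical helper: variation of information, 2 H(1,2) - H(1) - H(2).
vi_by_hand <- function(cl1, cl2) {
  p12 <- table(cl1, cl2) / length(cl1)       # joint label distribution
  entropy <- function(p) -sum(p[p > 0] * log2(p[p > 0]))
  2 * entropy(p12) - entropy(rowSums(p12)) - entropy(colSums(p12))
}

a  <- c(1, 1, 2, 2, 3, 3)
b  <- c(2, 2, 3, 3, 1, 1)   # same partition as a, labels permuted
cc <- c(1, 1, 1, 2, 2, 2)   # a coarser, different partition

ari_same <- ari_by_hand(a, b)
vi_same  <- vi_by_hand(a, b)
ari_diff <- ari_by_hand(a, cc)
vi_diff  <- vi_by_hand(a, cc)
print(c(ari_same, vi_same, ari_diff, vi_diff))
```

Both measures ignore the labels themselves: the permuted clustering b is treated as identical to a, while the coarser clustering cc yields a lower index and a positive distance.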