mcclust.ext-package: Point estimation and credible balls for Bayesian cluster...


Description

This package extends the mcclust package. It provides post-processing tools that summarize the posterior in Bayesian clustering models from MCMC samples of partitions. Functions for point estimation return a single clustering that is representative of the posterior, and credible balls can be computed to characterize the uncertainty in that point estimate.

Details

Package: mcclust.ext
Type: Package
Version: 1.0
Date: 2015-03-24
License: GPL (>= 2)

Most important functions:

The functions minVI and minbinder.ext find a point estimate of the clustering by minimizing the posterior expected Variation of Information and Binder's loss, respectively. The function minbinder.ext extends minbinder by providing a greedy search optimization method to find the optimal clustering. The function minVI provides several optimization methods to find the optimal clustering. For computational reasons, the lower bound to the posterior expected Variation of Information from Jensen's inequality is minimized.
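To make the criterion above more concrete, the following base R sketch builds a posterior similarity matrix from a toy set of sampled partitions and evaluates the part of the Jensen lower bound that depends on the candidate clustering. The toy data and the names psm_from_draws and vi_lower_bound_part are assumptions made for this illustration. It is not the package's implementation, and terms that do not depend on the candidate clustering are dropped.

```r
# Toy MCMC output (hypothetical): 4 sampled partitions of 5 items, one per row.
draws <- rbind(
  c(1, 1, 1, 2, 2),
  c(1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2),
  c(1, 1, 1, 1, 2)
)

# Posterior similarity matrix: fraction of draws in which items i and j
# fall in the same cluster.
psm_from_draws <- function(draws) {
  n <- ncol(draws)
  psm <- matrix(0, n, n)
  for (m in seq_len(nrow(draws))) {
    psm <- psm + outer(draws[m, ], draws[m, ], "==")
  }
  psm / nrow(draws)
}

# Candidate-dependent part of the lower bound to the posterior expected VI.
# For each item i: log2(size of i's cluster in cl) minus
# 2 * log2(sum of posterior similarities between i and its cluster in cl).
vi_lower_bound_part <- function(cl, psm) {
  n <- length(cl)
  total <- 0
  for (i in seq_len(n)) {
    same <- cl == cl[i]
    total <- total + log2(sum(same)) - 2 * log2(sum(psm[i, same]))
  }
  total / n
}

psm <- psm_from_draws(draws)

# Compare a few candidate clusterings; smaller values are preferred.
candidates <- list(
  two_groups = c(1, 1, 1, 2, 2),
  one_group  = c(1, 1, 1, 1, 1),
  singletons = 1:5
)
scores <- sapply(candidates, vi_lower_bound_part, psm = psm)
print(scores)
```

Because the bound depends on the samples only through the posterior similarity matrix, each candidate can be scored without looping over the MCMC draws, which is the computational advantage referred to above.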

The function credibleball computes a credible ball around the clustering estimate to characterize uncertainty. It returns the upper vertical, lower vertical, and horizontal bounds to describe the credible ball.
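The idea of a credible ball can be sketched in base R as follows: compute the distance from the point estimate to every sampled partition, then take the smallest radius whose ball contains at least the requested share of the samples. The toy data and the names vi_dist and ball_radius are assumptions made for this illustration, and the Variation of Information is assumed as the distance. The sketch covers only the radius, not the vertical and horizontal bounds that credibleball reports.

```r
# Toy MCMC output (hypothetical): 4 sampled partitions of 5 items, one per row.
draws <- rbind(
  c(1, 1, 1, 2, 2),
  c(1, 1, 2, 2, 2),
  c(1, 1, 1, 2, 2),
  c(1, 1, 1, 1, 2)
)
estimate <- c(1, 1, 1, 2, 2)  # assumed point estimate

# Variation of Information between two partitions of the same items.
vi_dist <- function(c1, c2) {
  n <- length(c1)
  total <- 0
  for (i in seq_len(n)) {
    in1 <- c1 == c1[i]
    in2 <- c2 == c2[i]
    total <- total + log2(sum(in1)) + log2(sum(in2)) - 2 * log2(sum(in1 & in2))
  }
  total / n
}

# Smallest radius such that at least a fraction `level` of the sampled
# partitions lie within that distance of the estimate.
ball_radius <- function(estimate, draws, level = 0.95) {
  d <- sort(apply(draws, 1, vi_dist, c2 = estimate))
  d[ceiling(level * length(d))]
}

radius_95 <- ball_radius(estimate, draws, level = 0.95)
radius_50 <- ball_radius(estimate, draws, level = 0.50)
print(c(radius_95 = radius_95, radius_50 = radius_50))
```

A small radius indicates that most of the posterior mass sits on partitions close to the estimate; a large radius signals substantial uncertainty.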

The function plotpsm produces a heat map of the posterior similarity matrix.
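For readers unfamiliar with this kind of plot, a minimal base R sketch is given below. It reorders the items by hierarchical clustering so that items that are frequently clustered together appear as blocks, then draws the matrix with image. The toy data and the choice of complete linkage on 1 - psm are assumptions for illustration; plotpsm may order and colour the matrix differently.

```r
# Toy MCMC output (hypothetical): 4 sampled partitions of 5 items, one per row.
draws <- rbind(
  c(1, 2, 1, 2, 1),
  c(1, 2, 1, 2, 2),
  c(1, 2, 1, 1, 1),
  c(1, 2, 1, 2, 1)
)

# Posterior similarity matrix: share of draws in which two items co-cluster.
co_cluster <- lapply(seq_len(nrow(draws)), function(m) {
  outer(draws[m, ], draws[m, ], "==")
})
psm <- Reduce(`+`, co_cluster) / nrow(draws)

# Order items so that similar items are adjacent (assumed ordering rule).
ord <- hclust(as.dist(1 - psm), method = "complete")$order
n <- ncol(psm)

# Heat map of the reordered matrix.
image(seq_len(n), seq_len(n), psm[ord, ord],
      col = heat.colors(20), zlim = c(0, 1),
      xlab = "item (reordered)", ylab = "item (reordered)",
      main = "Posterior similarity matrix")
```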

Author(s)

Sara Wade

Maintainer: Sara Wade <sara.wade@eng.cam.ac.uk>

References

Binder, D.A. (1978) Bayesian cluster analysis, Biometrika 65, 31–38.

Fritsch, A. and Ickstadt, K. (2009) An improved criterion for clustering based on the posterior similarity matrix, Bayesian Analysis 4, 367–391.

Lau, J.W. and Green, P.J. (2007) Bayesian model based clustering procedures, Journal of Computational and Graphical Statistics 16, 526–558.

Meila, M. (2007) Comparing clusterings - an information based distance, Journal of Multivariate Analysis 98, 873–895.

Wade, S. and Ghahramani, Z. (2015) Bayesian cluster analysis: Point estimation and credible balls. Submitted. arXiv:1505.03339.

See Also

mcclust

Examples

library(mcclust.ext)

# Load the galaxy data, predictive densities, and MCMC draws of partitions
data(galaxy.fit)
x <- data.frame(x = galaxy.fit$x)
data(galaxy.pred)
data(galaxy.draw)

# Find a representative partition of the posterior
# Variation of Information (minimizes the lower bound to the posterior expected VI)
psm <- comp.psm(galaxy.draw)
galaxy.VI <- minVI(psm, galaxy.draw, method = "all", include.greedy = TRUE)
summary(galaxy.VI)
plot(galaxy.VI, data = x, dx = galaxy.fit$fx, xgrid = galaxy.pred$x, dxgrid = galaxy.pred$fx)
# Compute the posterior expected Variation of Information
VI(galaxy.VI$cl, galaxy.draw)
# Binder's loss
galaxy.B <- minbinder.ext(psm, galaxy.draw, method = "all", include.greedy = TRUE)
summary(galaxy.B)
plot(galaxy.B, data = x, dx = galaxy.fit$fx, xgrid = galaxy.pred$x, dxgrid = galaxy.pred$fx)

# Uncertainty in the partition estimate
galaxy.cb <- credibleball(galaxy.VI$cl[1, ], galaxy.draw)
summary(galaxy.cb)
plot(galaxy.cb, data = x, dx = galaxy.fit$fx, xgrid = galaxy.pred$x, dxgrid = galaxy.pred$fx)

# Compare with the uncertainty shown in a heat map of the posterior similarity matrix
plotpsm(psm)

muschellij2/mcclust.ext documentation built on May 26, 2019, 9:36 a.m.