minVI: Minimize the posterior expected Variation of Information

Description

Finds a representative partition of the posterior by minimizing the lower bound to the posterior expected Variation of Information obtained from Jensen's inequality.

Usage

minVI(psm, cls.draw=NULL, method=c("avg","comp","draws","greedy","all"), 
      max.k=NULL, include.greedy=FALSE, start.cl=NULL, maxiter=NULL,
      l=NULL, suppress.comment=TRUE)

Arguments

psm

a posterior similarity matrix, which can be obtained from MCMC samples of clusterings through a call to comp.psm.

cls.draw

a matrix of the MCMC samples of clusterings of the ncol(cls.draw) data points that have been used to compute psm. Note: cls.draw must be provided if method="draws" or method="all".

method

the optimization method used. Should be one of "avg", "comp", "draws", "greedy" or "all". Defaults to "avg".

max.k

integer; if method="avg" or "comp", the maximum number of clusters up to which the hierarchical clustering is cut. Defaults to ceiling(nrow(psm)/4).

include.greedy

logical, should method "greedy" be included when method="all"? Defaults to FALSE.

start.cl

clustering used as the starting point for method="greedy". If NULL, start.cl=1:nrow(psm) is used.

maxiter

integer, maximum number of iterations for method="greedy". Defaults to 2*nrow(psm).

l

integer, specifies the number of local partitions considered at each iteration for method="greedy". Defaults to 2*nrow(psm).

suppress.comment

logical; if FALSE and method="greedy", a description of the current state (iteration number, number of clusters, posterior expected loss) is printed at each iteration. Defaults to TRUE.

Details

The Variation of Information between two clusterings is defined as the sum of the entropies minus two times the mutual information. Computation of the posterior expected Variation of Information can be expensive, as it requires a Monte Carlo estimate. We consider a modified posterior expected Variation of Information, obtained by swapping the log and expectation, which is much more computationally efficient as it only depends on the posterior through the posterior similarity matrix. From Jensen's inequality, the problem can be viewed as minimizing a lower bound to the posterior expected loss.
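
To make the bound concrete, the following is a minimal sketch of how the swapped bound can be computed for a candidate clustering cl from the posterior similarity matrix alone. Here vi.lb.sketch is a hypothetical helper written for illustration, not the package's VI.lb implementation.

# Minimal sketch of the lower bound to the posterior expected VI;
# illustration only, not the package's VI.lb implementation.
vi.lb.sketch=function(cl, psm){
  n=length(cl)
  total=0
  for(i in 1:n){
    ind=as.numeric(cl==cl[i])        # indicator: point j clustered with i in cl
    total=total + log2(sum(ind)) +   # entropy contribution of cl
      log2(sum(psm[i,])) -           # expected co-clustering mass for point i
      2*log2(sum(ind*psm[i,]))       # cross term from the mutual information
  }
  total/n
}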

We provide several optimization methods. For method="avg" and "comp", the search is restricted to the clusterings obtained from a hierarchical clustering with average/complete linkage and 1-psm as a distance matrix (the clusterings with number of clusters 1:max.k are considered).
Method "draws" restricts the search to the clusterings sampled in the MCMC algorithm.
Method "greedy" implements a greedy search algorithm, where at each iteration, we consider the l closest ancestors or descendants and move in the direction of minimum posterior expected loss with the VI distance. We recommend trying different starting locations cl.start and values of l that control the amount of local exploration. Depending on the starting location and l, the method can take some time to converge, thus it is only included in method="all" if include.greedy=TRUE. If method="all", the starting location cl.start defaults to the best clustering found by the other methods. A description of the algorithm at every iteration is printed if suppress.comment=FALSE. If method="all" all minimization methods except "greedy" are applied by default.

Value

cl

the clustering with the minimal value of the expected loss. If method="all", a matrix containing the clustering with the smallest value of the expected loss over all methods in the first row and the clusterings of the individual methods in the following rows.

value

the value of the posterior expected loss. If method="all", a vector with entries corresponding to the rows of cl.

method

the optimization method used.

iter.greedy

if method="greedy" or method="all" and include.greedy=T the number of iterations the method needed to converge.

Author(s)

Sara Wade, sara.wade@eng.cam.ac.uk

References

Meila, M. (2007) Comparing clusterings—an information based distance. Journal of Multivariate Analysis, 98, 873–895.

Wade, S. and Ghahramani, Z. (2015) Bayesian cluster analysis: Point estimation and credible balls. Submitted. arXiv:1505.03339

See Also

summary.c.estimate and plot.c.estimate to summarize and plot the output of minVI or minbinder.ext; VI or VI.lb to compute the posterior expected Variation of Information or the modified version obtained by swapping the log and expectation; comp.psm to compute the posterior similarity matrix; maxpear, minbinder.ext, and medv for other point estimates of the clustering based on the posterior; and credibleball to compute a credible ball characterizing the uncertainty around the point estimate.

Examples

data(ex2.data)
x=data.frame(ex2.data[,c(1,2)])
cls.true=ex2.data$cls.true

# Plot the data, colored by the true cluster membership
plot(x[,1],x[,2],xlab="x1",ylab="x2")
k=max(cls.true)
for(l in 2:k){
  points(x[cls.true==l,1],x[cls.true==l,2],col=l)}

# Find a representative partition of the posterior
data(ex2.draw)
psm=comp.psm(ex2.draw)
ex2.VI=minVI(psm,cls.draw=ex2.draw,method="all",include.greedy=TRUE)
summary(ex2.VI)
plot(ex2.VI,data=x)

# Compare with the point estimate under Binder's loss
ex2.B=minbinder.ext(psm,cls.draw=ex2.draw,method="all",include.greedy=TRUE)
summary(ex2.B)
plot(ex2.B,data=x)
