greedy: Optimizes the posterior expected loss with a greedy search algorithm


Description

Finds a representative partition of the posterior by minimizing the posterior expected loss with possible loss function of Binder's loss, the Variation of Information, and the modified Variation of Information through a greedy search algorithm.

Usage

greedy(psm, cls.draw = NULL, loss = NULL, start.cl = NULL,
maxiter = NULL, L = NULL, suppress.comment = TRUE)

Arguments

psm

a posterior similarity matrix, which can be obtained from MCMC samples of clusterings through a call to comp.psm.

cls.draw

a matrix of the MCMC samples of clusterings of the ncol(cls.draw) data points that have been used to compute psm. Note: cls.draw must be provided if loss="VI".

loss

the loss function used. Should be one of "Binder", "VI", or "VI.lb". Defaults to "VI.lb".

start.cl

clustering used as the starting point. If NULL, start.cl = 1:nrow(psm) is used, i.e. each point starts in its own cluster.

maxiter

integer, maximum number of iterations. Defaults to 2*nrow(psm).

L

integer, specifies the number of local partitions considered at each iteration. Defaults to 2*nrow(psm).

suppress.comment

logical; if FALSE, a description of the current state (iteration number, number of clusters, posterior expected loss) is printed at each iteration. Defaults to TRUE.

Details

This function is called by minVI and minbinder.ext to optimize the posterior expected loss via a greedy search algorithm. Possible loss functions include Binder's loss ("Binder") and the Variation of Information ("VI"). As computation of the posterior expected Variation of Information is expensive, a third option ("VI.lb") is to minimize a modified Variation of Information by swapping the log and expectation. From Jensen's inequality, this can be viewed as minimizing a lower bound to the posterior expected Variation of Information.
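As a rough illustration (not the package's internal code), both objectives can be computed directly from the posterior similarity matrix for a candidate clustering: the posterior expected Binder loss is the sum over pairs i &lt; j of |1(cl_i = cl_j) - psm_ij|, and the "VI.lb" lower bound replaces the expectation inside the logarithms of the VI. The function names below are hypothetical:

```r
# Illustrative only: direct computation of the two objectives that
# greedy() minimizes, for a candidate clustering `cl` and a
# posterior similarity matrix `psm`. Function names are hypothetical.

expected.binder <- function(cl, psm) {
  same <- outer(cl, cl, "==") * 1        # 1 if points i, j co-clustered in cl
  sum(abs(same - psm)[upper.tri(psm)])   # sum over pairs i < j
}

vi.lb <- function(cl, psm) {
  n <- nrow(psm)
  same <- outer(cl, cl, "==") * 1
  # Jensen's-inequality lower bound to the posterior expected VI:
  # the expectation (via psm) is moved inside the logs.
  sum(log2(rowSums(same)) + log2(rowSums(psm)) -
        2 * log2(rowSums(same * psm))) / n
}

# Tiny synthetic example: two well-separated groups of three points
psm <- matrix(0.1, 6, 6)
psm[1:3, 1:3] <- 0.9; psm[4:6, 4:6] <- 0.9; diag(psm) <- 1
cl <- c(1, 1, 1, 2, 2, 2)
expected.binder(cl, psm)   # 1.5 for this psm
vi.lb(cl, psm)
```

In practice these objectives are evaluated for you by minVI and minbinder.ext; the sketch is only meant to make the "swap the log and expectation" step concrete.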

At each iteration of the algorithm, we consider the L closest ancestors or descendants and move in the direction of minimum posterior expected loss; the distance is measured by Binder's loss or the Variation of Information, depending on the choice of loss. We recommend trying different starting locations start.cl and values of L, which control the amount of local exploration. A description of the algorithm at every iteration is printed if suppress.comment=FALSE.
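A heavily simplified sketch of a single greedy step is given below, considering only cluster merges (the "ancestors" of the current partition) scored by the posterior expected Binder loss; the actual algorithm also explores splits ("descendants"), restricts attention to the L closest candidates, and supports VI-based distances. All names here are illustrative, not the package's internals:

```r
# Hypothetical, simplified sketch of one greedy step: try every
# pairwise merge of the current clusters and keep the best one.

expected.binder <- function(cl, psm) {
  same <- outer(cl, cl, "==") * 1
  sum(abs(same - psm)[upper.tri(psm)])
}

greedy.merge.step <- function(cl, psm) {
  ks <- unique(cl)
  best <- cl
  best.loss <- expected.binder(cl, psm)
  for (a in seq_along(ks)) for (b in seq_len(a - 1)) {
    cand <- cl
    cand[cand == ks[a]] <- ks[b]     # merge cluster a into cluster b
    loss <- expected.binder(cand, psm)
    if (loss < best.loss) { best <- cand; best.loss <- loss }
  }
  list(cl = best, value = best.loss)
}

# Start from the all-singletons partition on a toy psm
psm <- matrix(0.1, 4, 4); psm[1:2, 1:2] <- 0.9; psm[3:4, 3:4] <- 0.9
diag(psm) <- 1
step1 <- greedy.merge.step(1:4, psm)
step1$cl   # points 1 and 2 merged: c(1, 1, 3, 4)
```

Iterating such steps until no candidate improves the objective (or maxiter is reached) gives the flavor of the search; the real implementation's candidate set and stopping rule follow the paper.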

Value

cl

clustering with minimal value of expected loss.

value

value of posterior expected loss.

iter.greedy

the number of iterations the method needed to converge.

Author(s)

Sara Wade, sara.wade@eng.cam.ac.uk

References

Wade, S. and Ghahramani, Z. (2015) Bayesian cluster analysis: Point estimation and credible balls. Submitted. arXiv:1505.03339.

See Also

minVI or minbinder.ext which call greedy to find the point estimate that minimizes the posterior expected loss.

Examples

data(ex1.data)
x <- ex1.data[, c(1, 2)]
cls.true <- ex1.data$cls.true
plot(x[, 1], x[, 2], xlab = "x1", ylab = "x2")
k <- max(cls.true)
for (l in 2:k) {
  points(x[cls.true == l, 1], x[cls.true == l, 2], col = l)
}

# Find representative partition of posterior
data(ex1.draw)
psm <- comp.psm(ex1.draw)
ex1.VI <- minVI(psm, method = "greedy", suppress.comment = FALSE)
summary(ex1.VI)
# Different initialization
ex1.VI.v2 <- minVI(psm, method = "greedy", suppress.comment = FALSE,
                   start.cl = ex1.draw[nrow(ex1.draw), ])
summary(ex1.VI.v2)

muschellij2/mcclust.ext documentation built on May 26, 2019, 9:36 a.m.