greedy: Optimizes the posterior expected loss with a greedy search algorithm


Description

Finds a representative partition of the posterior by minimizing the posterior expected loss with possible loss function of Binder's loss, the Variation of Information, and the modified Variation of Information through a greedy search algorithm.

Usage

greedy(psm, cls.draw = NULL, loss = NULL, start.cl = NULL,
maxiter = NULL, L = NULL, suppress.comment = TRUE)

Arguments

psm

a posterior similarity matrix, which can be obtained from MCMC samples of clusterings through a call to comp.psm.

cls.draw

a matrix of the MCMC samples of clusterings of the ncol(cls.draw) data points that have been used to compute psm. Note: cls.draw must be provided if loss="VI".

loss

the loss function used. Should be one of "Binder", "VI", or "VI.lb". Defaults to "VI.lb".

start.cl

clustering used as the starting point. If NULL, start.cl = 1:nrow(psm) is used, i.e. each point starts in its own cluster.

maxiter

integer, maximum number of iterations. Defaults to 2*nrow(psm).

L

integer, specifies the number of local partitions considered at each iteration. Defaults to 2*nrow(psm).

suppress.comment

logical; if FALSE, a description of the current state (iteration number, number of clusters, posterior expected loss) is printed at each iteration. Defaults to TRUE.

Details

This function is called by minVI and minbinder.ext to optimize the posterior expected loss via a greedy search algorithm. Possible loss functions include Binder's loss ("Binder") and the Variation of Information ("VI"). As computation of the posterior expected Variation of Information is expensive, a third option ("VI.lb") is to minimize a modified Variation of Information by swapping the log and expectation. From Jensen's inequality, this can be viewed as minimizing a lower bound to the posterior expected Variation of Information.
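As a rough illustration (not the package's internal code), both objectives can be computed directly from the posterior similarity matrix for a candidate clustering: the posterior expected Binder loss is the sum over pairs i &lt; j of |1(cl_i = cl_j) - psm_ij|, and the "VI.lb" lower bound replaces the expectation inside the logarithms of the VI. The function names below are hypothetical:

```r
# Illustrative only: direct computation of the two objectives that
# greedy() minimizes, for a candidate clustering `cl` and a
# posterior similarity matrix `psm`. Function names are hypothetical.

expected.binder <- function(cl, psm) {
  same <- outer(cl, cl, "==") * 1        # 1 if points i, j co-clustered in cl
  sum(abs(same - psm)[upper.tri(psm)])   # sum over pairs i < j
}

vi.lb <- function(cl, psm) {
  n <- nrow(psm)
  same <- outer(cl, cl, "==") * 1
  # Jensen's-inequality lower bound to the posterior expected VI:
  # the expectation (via psm) is moved inside the logs.
  sum(log2(rowSums(same)) + log2(rowSums(psm)) -
        2 * log2(rowSums(same * psm))) / n
}

# Tiny synthetic example: two well-separated groups of three points
psm <- matrix(0.1, 6, 6)
psm[1:3, 1:3] <- 0.9; psm[4:6, 4:6] <- 0.9; diag(psm) <- 1
cl <- c(1, 1, 1, 2, 2, 2)
expected.binder(cl, psm)   # 1.5 for this psm
vi.lb(cl, psm)
```

In practice these objectives are evaluated for you by minVI and minbinder.ext; the sketch is only meant to make the "swap the log and expectation" step concrete.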

At each iteration of the algorithm, we consider the L closest ancestors or descendants and move in the direction of minimum posterior expected loss; the distance is measured by Binder's loss or the Variation of Information, depending on the choice of loss. We recommend trying different starting locations start.cl and values of L, which control the amount of local exploration. A description of the algorithm at every iteration is printed if suppress.comment=FALSE.
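A heavily simplified sketch of a single greedy step is given below, considering only cluster merges (the "ancestors" of the current partition) scored by the posterior expected Binder loss; the actual algorithm also explores splits ("descendants"), restricts attention to the L closest candidates, and supports VI-based distances. All names here are illustrative, not the package's internals:

```r
# Hypothetical, simplified sketch of one greedy step: try every
# pairwise merge of the current clusters and keep the best one.

expected.binder <- function(cl, psm) {
  same <- outer(cl, cl, "==") * 1
  sum(abs(same - psm)[upper.tri(psm)])
}

greedy.merge.step <- function(cl, psm) {
  ks <- unique(cl)
  best <- cl
  best.loss <- expected.binder(cl, psm)
  for (a in seq_along(ks)) for (b in seq_len(a - 1)) {
    cand <- cl
    cand[cand == ks[a]] <- ks[b]     # merge cluster a into cluster b
    loss <- expected.binder(cand, psm)
    if (loss < best.loss) { best <- cand; best.loss <- loss }
  }
  list(cl = best, value = best.loss)
}

# Start from the all-singletons partition on a toy psm
psm <- matrix(0.1, 4, 4); psm[1:2, 1:2] <- 0.9; psm[3:4, 3:4] <- 0.9
diag(psm) <- 1
step1 <- greedy.merge.step(1:4, psm)
step1$cl   # points 1 and 2 merged: c(1, 1, 3, 4)
```

Iterating such steps until no candidate improves the objective (or maxiter is reached) gives the flavor of the search; the real implementation's candidate set and stopping rule follow the paper.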

Value

cl

clustering with minimal value of expected loss.

value

value of posterior expected loss.

iter.greedy

the number of iterations the method needed to converge.

Author(s)

Sara Wade, sara.wade@eng.cam.ac.uk

References

Wade, S. and Ghahramani, Z. (2015) Bayesian cluster analysis: Point estimation and credible balls. Submitted. arXiv:1505.03339.

See Also

minVI or minbinder.ext which call greedy to find the point estimate that minimizes the posterior expected loss.

Examples

data(ex1.data)
x <- ex1.data[, c(1, 2)]
cls.true <- ex1.data$cls.true
plot(x[, 1], x[, 2], xlab = "x1", ylab = "x2")
k <- max(cls.true)
for (l in 2:k) {
  points(x[cls.true == l, 1], x[cls.true == l, 2], col = l)
}

# Find representative partition of posterior
data(ex1.draw)
psm <- comp.psm(ex1.draw)
ex1.VI <- minVI(psm, method = "greedy", suppress.comment = FALSE)
summary(ex1.VI)
# Different initialization
ex1.VI.v2 <- minVI(psm, method = "greedy", suppress.comment = FALSE,
                   start.cl = ex1.draw[nrow(ex1.draw), ])
summary(ex1.VI.v2)

muschellij2/mcclust.ext documentation built on May 26, 2019, 9:36 a.m.