mpost.euc: Median Posterior for Subset Posterior Samples in Euclidean...
In SBmedian: Scalable Bayes with Median of Subset Posteriors

Description Usage Arguments Value References Examples

View source: R/mpost.euc.R

mpost.euc is a general framework to merge multiple empirical measures Q_1,Q_2,…,Q_M \subset R^p from independent subset of data by finding a median

\hat{Q} = \textrm{argmin}_Q ∑_{m=1}^M d(Q,Q_m)

where Q is a weighted combination and d(P_1,P_2) is distance in RKHS between two empirical measures P_1 and P_2. As in the references, we use RBF kernel with bandwidth parameter σ.

mpost.euc(
  splist,
  sigma = 0.1,
  maxiter = 121,
  abstol = 1e-06,
  show.progress = FALSE
)

`splist`	a list of length M containing vectors or matrices of univariate or multivariate subset posterior samples respectively.
`sigma`	bandwidth parameter for RBF kernel.
`maxiter`	maximum number of iterations for Weiszfeld algorithm.
`abstol`	stopping criterion for Weiszfeld algorithm.
`show.progress`	a logical; `TRUE` to show iteration mark, `FALSE` otherwise.

a named list containing:

med.atoms: a vector or matrix of all atoms aggregated.
med.weights: a weight vector that sums to 1 corresponding to med.atoms.
weiszfeld.weights: a weight for M subset posteriors.
weiszfeld.history: updated parameter values. Each row is for iteration, while columns are weights corresponding to weiszfeld.weights.

\insertRef

minsker_scalable_2014SBmedian

\insertRef

minsker_robust_2017SBmedian

## Median Posteior from 2-D Gaussian Samples
#  Step 1. let's build a list of atoms whose numbers differ
set.seed(8128)                   # for reproducible results
mydata = list()
mydata[[1]] = cbind(rnorm(96, mean= 1), rnorm(96, mean= 1))
mydata[[2]] = cbind(rnorm(78, mean=-1), rnorm(78, mean= 0))
mydata[[3]] = cbind(rnorm(65, mean=-1), rnorm(65, mean= 1))
mydata[[4]] = cbind(rnorm(77, mean= 2), rnorm(77, mean=-1))

#  Step 2. Let's run the algorithm
myrun = mpost.euc(mydata, show.progress=TRUE)

#  Step 3. Visualize
#  3-1. show subset posterior samples
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3), no.readonly=TRUE)
for (i in 1:4){
  plot(mydata[[i]], cex=0.5, col=(i+1), pch=19, xlab="", ylab="", 
       main=paste("subset",i), xlim=c(-4,4), ylim=c(-3,3))
}

#  3-2. 250 median posterior samples via importance sampling
id250 = base::sample(1:nrow(myrun$med.atoms), 250, prob=myrun$med.weights, replace=TRUE)
sp250 = myrun$med.atoms[id250,]
plot(sp250, cex=0.5, pch=19, xlab="", ylab="", 
     xlim=c(-4,4), ylim=c(-3,3), main="median samples")

#  3-3. convergence over iterations
matplot(myrun$weiszfeld.history, xlab="iteration", ylab="value",
        type="b", main="convergence of weights")
par(opar)