mpost.spd: Median Posterior for Subset Posterior Samples in SPD manifold
In SBmedian: Scalable Bayes with Median of Subset Posteriors

Description Usage Arguments Value Examples

View source: R/mpost.spd.R

SPD manifold is a collection of matrices that are symmetric and positive-definite and it is well known that using Euclidean geometry for data on the manifold is rather inaccurate. Here, we propose a function for dealing with SPD matrices specifically where valid examples include full-rank covariance and precision matrices. Note that N_M = ∑_{m=1}^M n_m.

mpost.spd(
  splist,
  sigma = 0.1,
  maxiter = 121,
  abstol = 1e-06,
  show.progress = FALSE
)

`splist`	a list of length M containing (p\times p) matrix or 3d array of size (p\times p\times n_m) whose slices are SPD matrices from subset posterior samples respectively.
`sigma`	bandwidth parameter for RBF kernel.
`maxiter`	maximum number of iterations for Weiszfeld algorithm.
`abstol`	stopping criterion for Weiszfeld algorithm.
`show.progress`	a logical; `TRUE` to show iteration mark, `FALSE` otherwise.

a named list containing:

med.atoms: a (p\times p\times N_M) 3d array whose slices are atoms aggregated.
med.weights: a weight vector that sums to 1 corresponding to med.atoms.
weiszfeld.weights: a weight for M subset posteriors.
weiszfeld.history: updated parameter values. Each row is for iteration, while columns are weights corresponding to weiszfeld.weights.

## Median Posteior from 5-dimension Wishart distribution
## Visualization will be performed for distribution of larget eigenvalue
## where RED is for estimated density and BLUE is density from all samples.

#  Step 1. let's build a list of atoms whose numbers differ
set.seed(8128)                   # for reproducible results
mydata = list()
mydata[[1]] = stats::rWishart(96, df=10, Sigma=diag(5))
mydata[[2]] = stats::rWishart(78, df=10, Sigma=diag(5))
mydata[[3]] = stats::rWishart(65, df=10, Sigma=diag(5))
mydata[[4]] = stats::rWishart(77, df=10, Sigma=diag(5))

#  Step 2. Let's run the algorithm
myrun = mpost.spd(mydata, show.progress=TRUE)

#  Step 3. Compute largest eigenvalues for the samples
eig4 = list()
for (i in 1:4){
  spdmats = mydata[[i]]        # SPD atoms
  spdsize = dim(spdmats)[3]    # number of atoms
  eigvals = rep(0,spdsize)     # compute largest eigenvalues
  for (j in 1:spdsize){
    eigvals[j] = max(base::eigen(spdmats[,,j])$values)
  }
  eig4[[i]] = eigvals
}
eigA   = unlist(eig4)
eiglim = c(min(eigA), max(eigA))

#  Step 4. Visualize
#  4-1. show distribution of subset posterior samples' eigenvalues
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3))
for (i in 1:4){
  hist(eig4[[i]], main=paste("subset", i), xlab="largest eigenvalues", 
       prob=TRUE, xlim=eiglim, ylim=c(0,0.1))
  lines(stats::density(eig4[[i]]), lwd=1, col="red")
  lines(stats::density(eigA),      lwd=1, col="blue")
}

#  4-2. 250 median posterior samples via importance sampling
id250 = base::sample(1:length(eigA), 250, prob=myrun$med.weights, replace=TRUE)
sp250 = eigA[id250]
hist(sp250, main="median samples", xlab="largest eigenvalues", 
     prob=TRUE, xlim=eiglim, ylim=c(0,0.1))
lines(stats::density(sp250), lwd=1, col="red")
lines(stats::density(eigA),  lwd=1, col="blue")

#  4-3. convergence over iterations
matplot(myrun$weiszfeld.history, xlab="iteration", ylab="value",
        type="b", main="convergence of weights")
par(opar)