msBP.Gibbs: Gibbs sampling for density estimation for msBP model

View source: R/msBP.Gibbs.R

msBP.GibbsR Documentation

Gibbs sampling for density estimation for msBP model

Description

Gibbs sampling for Markov Chain Motecarlo sampling from the posterior distribution of an msBP model.

Usage

msBP.Gibbs(x, a, b, g0 = "normal", g0par=c(0,1), mcmc, 
	grid = list(n.points=40, low=0.001, upp=0.999), state=NULL, hyper, 
	printing=0, maxScale=5, ...)

Arguments

x

the observed sample

a

scalar a parameter

b

scalar b parameter

g0

prior guess for the density of x. Currently only "normal", "unif", "gamma", and "empirical" are supported. From version 1.1 random paramters are also allowed (only with g0="normal").

g0par

additional scalar parameters for g0. If "normal" corresponds to mean and standard deviation, if "uniform" to upper and lower bounds, if "gamma" to shape and rate parameters. If "empirical" this value is not used. From version 1.1 random paramters are also allowed (only with g0="normal").

mcmc

a list giving the MCMC parameters. It must include the following integers: nb giving the number of burn-in iterations, nrep giving the total number of iterations (including nb), and ndisplay giving the multiple of iterations to be displayed on screen while the C++ routine is running (a message will be printed every ndisplay iterations).

grid

a list giving the parameters for plotting the posterior mean density over a finite grid. It must include the following values: low and upp giving the lower and upper bound respectively of the grid and n.points, an integer giving the number of points of the grid

state

a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis or if we want to start the MCMC algorithm from some particular value of the parameters.

hyper

a list containing the values of the hyperparameters for a and b or for the parameters of the prior guess (only if g0="normal") . It must contains hyperprior, a list of three logical values determining if hyperpriors for a, b and g0 are used (TRUE) or if a, b, or g0 are fixed (FALSE), and hyperpar a list containing the hyperparameters for the hyperprior distributions: beta, gamma, delta, lambda, mu0, kappa0, alpha0, and beta0. See details. gridB is a grid of values for which the prior (and posterior) for b is evaluated with a Griddy Gibbs approach (Ritter and Tanner, 1992). See details.

printing

Vector of integers if the internal C++ function need to print what is doing

maxScale

maximum scale of the binary trees.

...

additional arguments.

Details

Before calling the proper C++ subrouting the function center the sample on an initial guess for the density of the data. If g0 = 'empirical' the data are transformed so that the expctation of the msBP prior is centered on the kernel density estimate of x.

The algorithm consists of two primary steps: (i) allocate each observation to a multiscale cluster, conditionally on the values of the weights (see also msBP.postCluster); (ii) update the weights, conditionally on the cluster allocations. All the procedure is written in C++ and additional R scripts are used to pre- and post-process the data and the output.

If hyper$hyperpriors$a or hyper$hyperpriors$b is true, additional hyperpriors for a and b are assumed. Specifically the algorithm implements a \sim Ga(\beta,\gamma) and b \sim Ga(\delta, \lambda). For the former parameter the full conditional posterior distribution is available in closed form, i.e.

a | - \sim Ga\left(\beta + 2^{s'+1} - 1, \gamma - \sum_{s=0}^{s'} \sum_{h=1}^{2^s} \log(1-S_{s,h}) \right),

while for the latter its full conditional posterior is proportional to

\frac{b^{\delta-1}}{B(b,b)^{2^{s+1}-1}} \exp \left\{b \left( \sum_{s=0}^{s'} \sum_{h=1}^{2^s} \log\{R_{s,h} (1 - R_{s,h} )\} - \lambda\right) \right\},

where s' is the maximum occupied scale and B(p, q) is the Beta function. To sample from the latter distribution, a griddy Gibbs approach over the grid defined by hyper$hyperpar$gridB is used. See Ritter and Tanner (1992). From Version 1.1, if hyper$hyperpriors$base=TRUE and g0="normal" additional hyperpriors for the parameter of the centering normal density are assumed. Specifically the model is

y = \Phi(x; \mu, \sigma^2)

(\mu, \sigma^2) \sim N(\mu; \mu_0, \kappa_0\sigma^2)\mbox{I-Ga}(\sigma^2; \alpha_0, \beta_0)

and an addtional step simulating the values of \mu and \sigma^2 from their conditional posterior distribution is added to the Gibbs sampler of Canale and Dunson (2016). Specifically, a Metropolis-Hastings step with proposal equal to the prior is implemented.

Value

A list containing

density

A list containing postMeanDens, the posterior mean density estimate evaluated over xDens and postLowDens and postUppDens, the lower and upper pointwise 95% credible bands,

mcmc

A list containing the MCMC chains: dens is a matrix (nrep-nb) times n.grid, a and b are the vectors with the MCMC chains for the two parameters (if hyperprior was TRUE), scale is a matix where each column is a MCMC chain of the total mass for each scale, R and S, are matrices where each column in the tree2vec form of the corresponding trees, weights is a matrix where each column is the tree2vec form of the corresponding tree of weights, s and h are matrices where each column is the MCMC chain for the node labels for a subject.

postmean

A list containing posterior means over the MCMC samples of a, b, and of all binary trees

fit

A list containing the LPML, mean and median of the log CPO.

References

Canale, A. and Dunson, D. B. (2016), "Multiscale Bernstein polynomials for densities", Statistica Sinica, 26(3), 1175-1195.

Canale, A. (2017), "msBP: An R Package to Perform Bayesian Nonparametric Inference Using Multiscale Bernstein Polynomials Mixtures". Journal of Statistical Software, 78(6), 1-19.

Ritter C., Tanner M. (1992). "Facilitating the Gibbs Sampler: the Gibbs Stopper and the Griddy-Gibbs Sampler." Journal of the American Statistical Association, 87, 861-868.

See Also

msBP.postCluster

Examples

## Not run: 
data(galaxy)
galaxy <- data.frame(galaxy)
speeds <- galaxy$speed/1000
set.seed(1)
#with fixed g0 and random a, b
fit.msbp.1 <- msBP.Gibbs(speeds, a = 10, b = 5, g0 = "empirical", 
	mcmc=list(nrep = 10000, nb = 5000, ndisplay = 1000), 
	hyper=list(hyperprior=list(a = TRUE, b = TRUE, g0 = FALSE), 
	hyperpar=list(beta=5,gamma = 1,delta = 1,lambda = 1)), 
	printing = 0, maxS = 7, grid = list(n.points = 150, low = 5, upp = 38))

#with random a, b and hyperparameters of g0
fit.msbp.2 <- msBP.Gibbs(speeds, a = 10, b=5, g0 = "normal", 
	mcmc=list(nrep = 10000, nb = 5000, ndisplay = 1000), 
	hyper=list(hyperprior = list(a = TRUE, b = TRUE, g0 = TRUE), 
  hyperpar = list(beta = 50, gamma = 5, delta = 10, lambda = 1,
	gridB = seq(0, 20, length = 30),
	mu0 = 21, kappa0 = 0.1, alpha0 = 1, beta0 = 20)), 
	printing = 0, maxS = 7, grid = list(n.points = 150, lo w= 5, upp = 38))	

hist(speeds, prob=TRUE,br=10, ylim=c(0,0.23), main="", col='grey')
points(fit.msbp.1$density$postMeanDens~fit.msbp.1$density$xDens, ty='l', lwd=2)
points(fit.msbp.1$density$postUppDens~fit.msbp.1$density$xDens, ty='l',lty=2, lwd=2)
points(fit.msbp.1$density$postLowDens~fit.msbp.1$density$xDens, ty='l',lty=2, lwd=2)

hist(speeds, prob=TRUE,br=10, ylim=c(0,0.23), main="", col='grey')
points(fit.msbp.2$density$postMeanDens~fit.msbp.2$density$xDens, ty='l', lwd=2)
points(fit.msbp.2$density$postUppDens~fit.msbp.2$density$xDens, ty='l',lty=2, lwd=2)
points(fit.msbp.2$density$postLowDens~fit.msbp.2$density$xDens, ty='l',lty=2, lwd=2)


## End(Not run)

msBP documentation built on Aug. 23, 2023, 1:06 a.m.