Description Usage Arguments Details Value Warning Note Author(s) References See Also Examples
View source: R/bridge_sampler.R
Computes log marginal likelihood via bridge sampling.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 | bridge_sampler(samples, ...)
## S3 method for class 'stanfit'
bridge_sampler(
samples = NULL,
stanfit_model = samples,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
maxiter = 1000,
silent = FALSE,
verbose = FALSE,
...
)
## S3 method for class 'mcmc.list'
bridge_sampler(
samples = NULL,
log_posterior = NULL,
...,
data = NULL,
lb = NULL,
ub = NULL,
repetitions = 1,
param_types = rep("real", ncol(samples[[1]])),
method = "normal",
cores = 1,
use_neff = TRUE,
packages = NULL,
varlist = NULL,
envir = .GlobalEnv,
rcppFile = NULL,
maxiter = 1000,
silent = FALSE,
verbose = FALSE
)
## S3 method for class 'mcmc'
bridge_sampler(
samples = NULL,
log_posterior = NULL,
...,
data = NULL,
lb = NULL,
ub = NULL,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
packages = NULL,
varlist = NULL,
envir = .GlobalEnv,
rcppFile = NULL,
maxiter = 1000,
param_types = rep("real", ncol(samples)),
silent = FALSE,
verbose = FALSE
)
## S3 method for class 'matrix'
bridge_sampler(
samples = NULL,
log_posterior = NULL,
...,
data = NULL,
lb = NULL,
ub = NULL,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
packages = NULL,
varlist = NULL,
envir = .GlobalEnv,
rcppFile = NULL,
maxiter = 1000,
param_types = rep("real", ncol(samples)),
silent = FALSE,
verbose = FALSE
)
## S3 method for class 'stanreg'
bridge_sampler(
samples,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
maxiter = 1000,
silent = FALSE,
verbose = FALSE,
...
)
## S3 method for class 'rjags'
bridge_sampler(
samples = NULL,
log_posterior = NULL,
...,
data = NULL,
lb = NULL,
ub = NULL,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
packages = NULL,
varlist = NULL,
envir = .GlobalEnv,
rcppFile = NULL,
maxiter = 1000,
silent = FALSE,
verbose = FALSE
)
## S3 method for class 'runjags'
bridge_sampler(
samples = NULL,
log_posterior = NULL,
...,
data = NULL,
lb = NULL,
ub = NULL,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
packages = NULL,
varlist = NULL,
envir = .GlobalEnv,
rcppFile = NULL,
maxiter = 1000,
silent = FALSE,
verbose = FALSE
)
## S3 method for class 'MCMC_refClass'
bridge_sampler(
samples,
repetitions = 1,
method = "normal",
cores = 1,
use_neff = TRUE,
maxiter = 1000,
silent = FALSE,
verbose = FALSE,
...
)
|
samples |
an |
... |
additional arguments passed to |
stanfit_model |
for the |
repetitions |
number of repetitions. |
method |
either |
cores |
number of cores used for evaluating |
use_neff |
Boolean which determines whether the effective sample size is
used in the optimal bridge function. Default is TRUE. If FALSE, the number
of samples is used instead. If |
maxiter |
maximum number of iterations for the iterative updating scheme. Default is 1,000 to avoid infinite loops. |
silent |
Boolean which determines whether to print the number of iterations of the updating scheme to the console. Default is FALSE. |
verbose |
Boolean. Should internal debug information be printed to
console? Default is |
log_posterior |
function or name of function that takes a parameter
vector and the |
data |
data object which is used in |
lb |
named vector with lower bounds for parameters. |
ub |
named vector with upper bounds for parameters. |
param_types |
character vector of length |
packages |
character vector with names of packages needed for evaluating
|
varlist |
character vector with names of variables needed for evaluating
|
envir |
specifies the environment for |
rcppFile |
in case |
Bridge sampling is implemented as described in Meng and Wong (1996,
see equation 4.1) using the "optimal" bridge function. When method =
"normal"
, the proposal distribution is a multivariate normal distribution
with mean vector equal to the sample mean vector of samples
and
covariance matrix equal to the sample covariance matrix of samples
.
For a recent tutorial on bridge sampling, see Gronau et al. (in press).
When method = "warp3"
, the proposal distribution is a standard
multivariate normal distribution and the posterior distribution is "warped"
(Meng & Schilling, 2002) so that it has the same mean vector, covariance
matrix, and skew as the samples. method = "warp3"
takes approximately
twice as long as method = "normal"
.
Note that for the matrix
method, the lower and upper bound of a
parameter cannot be a function of the bounds of another parameter.
Furthermore, constraints that depend on multiple parameters of the model are
not supported. This usually excludes, for example, parameters that
constitute a covariance matrix or sets of parameters that need to sum to
one.
However, if the retransformations are part of the model itself and the
log_posterior
accepts parameters on the real line and performs the
appropriate Jacobian adjustments, such as done for stanfit
and
stanreg
objects, such constraints are obviously possible (i.e., we
currently do not know of any parameter supported within Stan that does not
work with the current implementation through a stanfit
object).
On unix-like systems forking is used via
mclapply
. Hence elements needed for evaluation of
log_posterior
should be in the .GlobalEnv
.
On other OSes (e.g., Windows), things can get more complicated. For normal
parallel computation, the log_posterior
function can be passed as
both function and function name. If the latter, it needs to exist in the
environment specified in the envir
argument. For parallel computation
when using an Rcpp
function, log_posterior
can only be passed
as the function name (i.e., character). This function needs to result from
calling sourceCpp
on the file specified in rcppFile
.
Due to the way rstan
currently works, parallel computations with
stanfit
and stanreg
objects only work with forking (i.e., NOT
on Windows).
if repetitions = 1
, returns a list of class "bridge"
with components:
logml
: estimate of log marginal
likelihood.
niter
: number of iterations of the iterative
updating scheme.
method
: bridge sampling method that was used
to obtain the estimate.
q11
: log posterior evaluations for
posterior samples.
q12
: log proposal evaluations for posterior
samples.
q21
: log posterior evaluations for samples from
proposal.
q22
: log proposal evaluations for samples from
proposal.
if repetitions > 1
, returns a list of class
"bridge_list"
with components:
logml
: numeric
vector with estimates of log marginal likelihood.
niter
:
numeric vector with number of iterations of the iterative updating scheme
for each repetition.
method
: bridge sampling method that was
used to obtain the estimates.
repetitions
: number of
repetitions.
Note that the results depend strongly on the parameter priors. Therefore, it is strongly advised to think carefully about the priors before calculating marginal likelihoods. For example, the prior choices implemented in rstanarm or brms might not be optimal from a testing point of view. We recommend to use priors that have been chosen from a testing and not a purely estimation perspective.
Also note that for testing, the number of posterior samples usually needs to be substantially larger than for estimation.
To be able to use a stanreg
object for samples
, the user
crucially needs to have specified the diagnostic_file
when fitting
the model in rstanarm.
Quentin F. Gronau and Henrik Singmann. Parallel computing (i.e.,
cores > 1
) and the stanfit
method use code from rstan
by Jiaqing Guo, Jonah Gabry, and Ben Goodrich. Ben Goodrich added the
stanreg
method. Kees Mulder added methods for simplex and circular
variables.
Gronau, Q. F., Singmann, H., & Wagenmakers, E.-J. (2020). bridgesampling: An R Package for Estimating Normalizing Constants. Journal of Statistical Software, 92. doi: 10.18637/jss.v092.i10
Gronau, Q. F., Sarafoglou, A., Matzke, D., Ly, A., Boehm, U.,
Marsman, M., Leslie, D. S., Forster, J. J., Wagenmakers, E.-J., &
Steingroever, H. (in press). A tutorial on bridge sampling. Journal of
Mathematical Psychology. https://arxiv.org/abs/1703.05984
vignette("bridgesampling_tutorial")
Gronau, Q. F., Wagenmakers, E.-J., Heck, D. W., & Matzke, D. (2017). A simple method for comparing complex models: Bayesian model comparison for hierarchical multinomial processing tree models using Warp-III bridge sampling. Manuscript submitted for publication. https://psyarxiv.com/yxhfm
Meng, X.-L., & Wong, W. H. (1996). Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Statistica Sinica, 6, 831-860. http://www3.stat.sinica.edu.tw/statistica/j6n4/j6n43/j6n43.htm
Meng, X.-L., & Schilling, S. (2002). Warp bridge sampling. Journal of Computational and Graphical Statistics, 11(3), 552-586. doi: 10.1198/106186002457
Overstall, A. M., & Forster, J. J. (2010). Default Bayesian model determination methods for generalised linear mixed models. Computational Statistics & Data Analysis, 54, 3269-3288. doi: 10.1016/j.csda.2010.03.008
bf
allows the user to calculate Bayes factors and
post_prob
allows the user to calculate posterior model
probabilities from bridge sampling estimates. bridge-methods
lists some additional methods that automatically invoke the
error_measures
function.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 | ## ------------------------------------------------------------------------
## Example 1: Estimating the Normalizing Constant of a Two-Dimensional
## Standard Normal Distribution
## ------------------------------------------------------------------------
library(bridgesampling)
library(mvtnorm)
samples <- rmvnorm(1e4, mean = rep(0, 2), sigma = diag(2))
colnames(samples) <- c("x1", "x2")
log_density <- function(samples.row, data) {
-.5*t(samples.row) %*% samples.row
}
lb <- rep(-Inf, 2)
ub <- rep(Inf, 2)
names(lb) <- names(ub) <- colnames(samples)
bridge_result <- bridge_sampler(samples = samples, log_posterior = log_density,
data = NULL, lb = lb, ub = ub, silent = TRUE)
# compare to analytical value
analytical <- log(2*pi)
print(cbind(bridge_result$logml, analytical))
## Not run:
## ------------------------------------------------------------------------
## Example 2: Hierarchical Normal Model
## ------------------------------------------------------------------------
# for a full description of the example, see
vignette("bridgesampling_example_jags")
library(R2jags)
### generate data ###
set.seed(12345)
mu <- 0
tau2 <- 0.5
sigma2 <- 1
n <- 20
theta <- rnorm(n, mu, sqrt(tau2))
y <- rnorm(n, theta, sqrt(sigma2))
### set prior parameters
alpha <- 1
beta <- 1
mu0 <- 0
tau20 <- 1
### functions to get posterior samples ###
### H0: mu = 0
getSamplesModelH0 <- function(data, niter = 52000, nburnin = 2000, nchains = 3) {
model <- "
model {
for (i in 1:n) {
theta[i] ~ dnorm(0, invTau2)
y[i] ~ dnorm(theta[i], 1/sigma2)
}
invTau2 ~ dgamma(alpha, beta)
tau2 <- 1/invTau2
}"
s <- jags(data, parameters.to.save = c("theta", "invTau2"),
model.file = textConnection(model),
n.chains = nchains, n.iter = niter,
n.burnin = nburnin, n.thin = 1)
return(s)
}
### H1: mu != 0
getSamplesModelH1 <- function(data, niter = 52000, nburnin = 2000,
nchains = 3) {
model <- "
model {
for (i in 1:n) {
theta[i] ~ dnorm(mu, invTau2)
y[i] ~ dnorm(theta[i], 1/sigma2)
}
mu ~ dnorm(mu0, 1/tau20)
invTau2 ~ dgamma(alpha, beta)
tau2 <- 1/invTau2
}"
s <- jags(data, parameters.to.save = c("theta", "mu", "invTau2"),
model.file = textConnection(model),
n.chains = nchains, n.iter = niter,
n.burnin = nburnin, n.thin = 1)
return(s)
}
### get posterior samples ###
# create data lists for Jags
data_H0 <- list(y = y, n = length(y), alpha = alpha, beta = beta, sigma2 = sigma2)
data_H1 <- list(y = y, n = length(y), mu0 = mu0, tau20 = tau20, alpha = alpha,
beta = beta, sigma2 = sigma2)
# fit models
samples_H0 <- getSamplesModelH0(data_H0)
samples_H1 <- getSamplesModelH1(data_H1)
### functions for evaluating the unnormalized posteriors on log scale ###
log_posterior_H0 <- function(samples.row, data) {
mu <- 0
invTau2 <- samples.row[[ "invTau2" ]]
theta <- samples.row[ paste0("theta[", seq_along(data$y), "]") ]
sum(dnorm(data$y, theta, data$sigma2, log = TRUE)) +
sum(dnorm(theta, mu, 1/sqrt(invTau2), log = TRUE)) +
dgamma(invTau2, data$alpha, data$beta, log = TRUE)
}
log_posterior_H1 <- function(samples.row, data) {
mu <- samples.row[[ "mu" ]]
invTau2 <- samples.row[[ "invTau2" ]]
theta <- samples.row[ paste0("theta[", seq_along(data$y), "]") ]
sum(dnorm(data$y, theta, data$sigma2, log = TRUE)) +
sum(dnorm(theta, mu, 1/sqrt(invTau2), log = TRUE)) +
dnorm(mu, data$mu0, sqrt(data$tau20), log = TRUE) +
dgamma(invTau2, data$alpha, data$beta, log = TRUE)
}
# specify parameter bounds H0
cn <- colnames(samples_H0$BUGSoutput$sims.matrix)
cn <- cn[cn != "deviance"]
lb_H0 <- rep(-Inf, length(cn))
ub_H0 <- rep(Inf, length(cn))
names(lb_H0) <- names(ub_H0) <- cn
lb_H0[[ "invTau2" ]] <- 0
# specify parameter bounds H1
cn <- colnames(samples_H1$BUGSoutput$sims.matrix)
cn <- cn[cn != "deviance"]
lb_H1 <- rep(-Inf, length(cn))
ub_H1 <- rep(Inf, length(cn))
names(lb_H1) <- names(ub_H1) <- cn
lb_H1[[ "invTau2" ]] <- 0
# compute log marginal likelihood via bridge sampling for H0
H0.bridge <- bridge_sampler(samples = samples_H0, data = data_H0,
log_posterior = log_posterior_H0, lb = lb_H0,
ub = ub_H0, silent = TRUE)
print(H0.bridge)
# compute log marginal likelihood via bridge sampling for H1
H1.bridge <- bridge_sampler(samples = samples_H1, data = data_H1,
log_posterior = log_posterior_H1, lb = lb_H1,
ub = ub_H1, silent = TRUE)
print(H1.bridge)
# compute percentage error
print(error_measures(H0.bridge)$percentage)
print(error_measures(H1.bridge)$percentage)
# compute Bayes factor
BF01 <- bf(H0.bridge, H1.bridge)
print(BF01)
# compute posterior model probabilities (assuming equal prior model probabilities)
post1 <- post_prob(H0.bridge, H1.bridge)
print(post1)
# compute posterior model probabilities (using user-specified prior model probabilities)
post2 <- post_prob(H0.bridge, H1.bridge, prior_prob = c(.6, .4))
print(post2)
## End(Not run)
## Not run:
## ------------------------------------------------------------------------
## Example 3: rstanarm
## ------------------------------------------------------------------------
library(rstanarm)
# N.B.: remember to specify the diagnostic_file
fit_1 <- stan_glm(mpg ~ wt + qsec + am, data = mtcars,
chains = 2, cores = 2, iter = 5000,
diagnostic_file = file.path(tempdir(), "df.csv"))
bridge_1 <- bridge_sampler(fit_1)
fit_2 <- update(fit_1, formula = . ~ . + cyl)
bridge_2 <- bridge_sampler(fit_2, method = "warp3")
bf(bridge_1, bridge_2)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.