model-method-pathfinder | R Documentation |
The $pathfinder() method of a CmdStanModel object runs
Stan's Pathfinder algorithms. Pathfinder is a variational method for
approximately sampling from differentiable log densities. Starting from a
random initialization, Pathfinder locates normal approximations
to the target density along a quasi-Newton optimization path in
the unconstrained space, with local covariance estimated using
the negative inverse Hessian estimates produced by the LBFGS
optimizer. Pathfinder selects the normal approximation with the
lowest estimated Kullback-Leibler (KL) divergence to the true
posterior. Finally, Pathfinder draws from that normal
approximation and returns the draws transformed to the
constrained scale. See the CmdStan User’s Guide for more details.
Any argument left as NULL takes the default value used by the
installed version of CmdStan.
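The steps in the description above can be made concrete with a toy, one-dimensional sketch in base R. This is not Stan's implementation: it assumes plain gradient ascent in place of LBFGS, uses the exact second derivative in place of LBFGS's inverse Hessian estimates, and starts from a fixed value rather than a random one so that the run is reproducible. The function `toy_pathfinder()` and its arguments `n_iter` and `step` are hypothetical names chosen for illustration. The target is the Bernoulli example model (2 successes in 10 trials, flat prior) on the unconstrained logit scale.

```r
# Toy illustration of the Pathfinder idea in base R (not Stan's algorithm).

# Unnormalized log posterior on the unconstrained scale u = qlogis(theta),
# including the log Jacobian of the transform, plus its first and second
# derivatives with respect to u.
log_p <- function(u) 3 * plogis(u, log.p = TRUE) + 9 * plogis(-u, log.p = TRUE)
grad  <- function(u) 3 - 12 * plogis(u)
hess  <- function(u) -12 * plogis(u) * (1 - plogis(u))

toy_pathfinder <- function(u_init, n_iter = 20, step = 0.2,
                           num_elbo_draws = 100, draws = 1000) {
  # 1. Optimization path (gradient ascent stands in for LBFGS here).
  path <- numeric(n_iter + 1)
  path[1] <- u_init
  for (i in seq_len(n_iter)) path[i + 1] <- path[i] + step * grad(path[i])

  # 2. Normal approximation at each point on the path: variance from the
  #    negative inverse Hessian, mean from one Newton step.
  v  <- -1 / hess(path)
  mu <- path + v * grad(path)

  # 3. Monte Carlo ELBO estimate for each approximation; the highest ELBO
  #    corresponds to the lowest estimated KL divergence to the target.
  elbo <- vapply(seq_along(path), function(i) {
    x <- rnorm(num_elbo_draws, mu[i], sqrt(v[i]))
    mean(log_p(x) - dnorm(x, mu[i], sqrt(v[i]), log = TRUE))
  }, numeric(1))
  best <- which.max(elbo)

  # 4. Draw from the selected approximation and map back to (0, 1).
  u <- rnorm(draws, mu[best], sqrt(v[best]))
  list(theta = plogis(u), best = best, elbo = elbo)
}

set.seed(123)
res <- toy_pathfinder(u_init = 2)
mean(res$theta)  # compare with the exact Beta(3, 9) posterior mean, 0.25
```

Because the normal approximation lives on the logit scale, the transformed draws will not match the exact posterior perfectly; the sketch is meant only to show the sequence of steps.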
pathfinder(
  data = NULL,
  seed = NULL,
  refresh = NULL,
  init = NULL,
  save_latent_dynamics = FALSE,
  output_dir = getOption("cmdstanr_output_dir"),
  output_basename = NULL,
  sig_figs = NULL,
  opencl_ids = NULL,
  num_threads = NULL,
  init_alpha = NULL,
  tol_obj = NULL,
  tol_rel_obj = NULL,
  tol_grad = NULL,
  tol_rel_grad = NULL,
  tol_param = NULL,
  history_size = NULL,
  single_path_draws = NULL,
  draws = NULL,
  num_paths = 4,
  max_lbfgs_iters = NULL,
  num_elbo_draws = NULL,
  save_single_paths = NULL,
  psis_resample = NULL,
  calculate_lp = NULL,
  show_messages = TRUE,
  show_exceptions = TRUE,
  save_cmdstan_config = NULL
)
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
(logical) Should auxiliary diagnostic information
about the latent dynamics be written to temporary diagnostic CSV files?
This argument replaces CmdStan's |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
num_threads |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections (e.g., for multi-path pathfinder
as well as |
init_alpha |
(positive real) The initial step size parameter. |
tol_obj |
(positive real) Convergence tolerance on changes in objective function value. |
tol_rel_obj |
(positive real) Convergence tolerance on relative changes in objective function value. |
tol_grad |
(positive real) Convergence tolerance on the norm of the gradient. |
tol_rel_grad |
(positive real) Convergence tolerance on the relative norm of the gradient. |
tol_param |
(positive real) Convergence tolerance on changes in parameter value. |
history_size |
(positive integer) The size of the history used when approximating the Hessian. |
single_path_draws |
(positive integer) Number of draws a single
pathfinder should return. The number of draws PSIS sampling samples from
will be equal to |
draws |
(positive integer) Number of draws to return after performing
Pareto smoothed importance sampling (PSIS). This should be smaller than
|
num_paths |
(positive integer) Number of single pathfinders to run. |
max_lbfgs_iters |
(positive integer) The maximum number of iterations for LBFGS. |
num_elbo_draws |
(positive integer) Number of draws to make when calculating the ELBO of the approximation at each iteration of LBFGS. |
save_single_paths |
(logical) Whether to save the results of single pathfinder runs in multi-pathfinder. |
psis_resample |
(logical) Whether to perform Pareto smoothed importance sampling.
If |
calculate_lp |
(logical) Whether to calculate the log probability of the draws.
If |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
save_cmdstan_config |
(logical) When |
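To give a feel for what the `draws` and `psis_resample` arguments control, here is a minimal base R sketch of importance resampling: draws pooled from an approximation are reweighted toward the target density and then resampled. It is a simplification under stated assumptions. It omits the Pareto smoothing of the weights that PSIS performs, and the function `importance_resample()` and the normal target and proposal are hypothetical stand-ins, not part of cmdstanr or CmdStan.

```r
# Plain importance resampling in base R (no Pareto smoothing of the weights).
importance_resample <- function(x, log_target, log_proposal, draws) {
  log_w <- log_target(x) - log_proposal(x)
  w <- exp(log_w - max(log_w))  # subtract the max for numerical stability
  sample(x, size = draws, replace = TRUE, prob = w / sum(w))
}

set.seed(123)
# Pooled draws from an approximation that is too wide for the target.
pooled <- rnorm(4000, mean = 0, sd = 1.5)
resampled <- importance_resample(
  pooled,
  log_target   = function(x) dnorm(x, 0, 1, log = TRUE),
  log_proposal = function(x) dnorm(x, 0, 1.5, log = TRUE),
  draws = 1000
)
sd(pooled)     # roughly the proposal scale
sd(resampled)  # pulled toward the target scale
```

Resampling returns fewer draws than were pooled, which is why the number of returned draws is normally set below the number of pooled draws.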
A CmdStanPathfinder
object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
model-method-check_syntax, model-method-compile, model-method-diagnose,
model-method-expose_functions, model-method-format,
model-method-generate-quantities, model-method-laplace,
model-method-optimize, model-method-sample, model-method-sample_mpi,
model-method-variables, model-method-variational
## Not run:
library(cmdstanr)
library(posterior)
library(bayesplot)
color_scheme_set("brightblue")
# Set path to CmdStan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)
# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()
# Print with line numbers. This can be set globally using the
# `cmdstanr_print_line_numbers` option.
mod$print(line_numbers = TRUE)
# Data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
# Run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  parallel_chains = 2
)
# Use 'posterior' package for summaries
fit_mcmc$summary()
# Check sampling diagnostics
fit_mcmc$diagnostic_summary()
# Get posterior draws
draws <- fit_mcmc$draws()
print(draws)
# Convert to data frame using posterior::as_draws_df
as_draws_df(draws)
# Plot posterior using bayesplot (ggplot2)
mcmc_hist(fit_mcmc$draws("theta"))
# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
fit_optim$summary()
# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
# to the posterior
fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
fit_laplace$summary()
# Run 'variational' method to use ADVI to approximate posterior
fit_vb <- mod$variational(data = stan_data, seed = 123)
fit_vb$summary()
mcmc_hist(fit_vb$draws("theta"))
# Run 'pathfinder' method, a new alternative to the variational method
fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
fit_pf$summary()
mcmc_hist(fit_pf$draws("theta"))
# Run 'pathfinder' again with more paths, fewer draws per path,
# better covariance approximation, and fewer LBFGS iterations
fit_pf <- mod$pathfinder(data = stan_data, num_paths = 10, single_path_draws = 40,
                         history_size = 50, max_lbfgs_iters = 100)
# Specifying initial values as a function
fit_mcmc_w_init_fun <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function() list(theta = runif(1))
)
fit_mcmc_w_init_fun_2 <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function(chain_id) {
    # silly but demonstrates optional use of chain_id
    list(theta = 1 / (chain_id + 1))
  }
)
fit_mcmc_w_init_fun_2$init()
# Specifying initial values as a list of lists
fit_mcmc_w_init_list <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = list(
    list(theta = 0.75), # chain 1
    list(theta = 0.25)  # chain 2
  )
)
fit_optim_w_init_list <- mod$optimize(
  data = stan_data,
  seed = 123,
  init = list(
    list(theta = 0.75)
  )
)
fit_optim_w_init_list$init()
## End(Not run)