Carry out statistical inference on greta models by MCMC or likelihood/posterior optimisation.
Usage

mcmc(model, sampler = hmc(), n_samples = 1000, thin = 1,
  warmup = 1000, chains = 4, n_cores = NULL, verbose = TRUE,
  pb_update = 50, one_by_one = FALSE, initial_values = initials())

stashed_samples()

extra_samples(draws, n_samples = 1000, thin = 1, n_cores = NULL,
  verbose = TRUE, pb_update = 50, one_by_one = FALSE)

initials(...)

opt(model, optimiser = bfgs(), max_iterations = 100, tolerance = 1e-06,
  initial_values = initials(), adjust = TRUE, hessian = FALSE)
Arguments

model: a greta_model object
sampler: sampler used to draw values in MCMC. See samplers() for options.
n_samples: number of MCMC samples to draw per chain (after any warm-up, but before thinning)
thin: MCMC thinning rate; every thin-th sample is retained, the rest are discarded
warmup: number of samples to spend warming up the mcmc sampler (moving chains toward the highest density area and tuning sampler hyperparameters)
chains: number of MCMC chains to run
n_cores: the maximum number of CPU cores used by each sampler (see details)
verbose: whether to print progress information to the console
pb_update: how regularly to update the progress bar (in iterations)
one_by_one: whether to run TensorFlow MCMC code one iteration at a time, so that greta can handle numerical errors as 'bad' proposals (see below)
initial_values: an initials object (or a list of initials objects, one per chain) giving starting values for some or all of the model variables
draws: an mcmc.list object returned by mcmc() or stashed_samples()
...: named numeric values, giving initial values of some or all of the variables in the model (unnamed variables will be automatically initialised)
optimiser: the optimiser to use, e.g. bfgs()
max_iterations: the maximum number of iterations before giving up
tolerance: the numerical tolerance for the solution; the optimiser stops when the (absolute) difference in the joint density between successive iterations drops below this level
adjust: whether to account for Jacobian adjustments in the joint density. Set to FALSE to skip the Jacobian adjustments used in MAP estimation, e.g. to obtain maximum likelihood estimates.
hessian: whether to return a list of analytically differentiated Hessian arrays for the parameters
Details

If verbose = TRUE, the progress bar shows the number of iterations so far and the expected time to complete the phase of model fitting (warmup or sampling). Occasionally, a proposed set of parameters can cause numerical instability (i.e. the log density or its gradient was NA, Inf or -Inf), normally because the log joint density is so low that it can't be represented as a floating point number. When this happens, the progress bar will also display the proportion of proposals so far that were 'bad' (numerically unstable) and therefore rejected. Some numerical instability during the warmup phase is normal, but 'bad' samples during the sampling phase can lead to bias in your posterior sample. If you only have a few bad samples (<10%), you can usually resolve this with a longer warmup period or by manually defining starting values to move the sampler into a more reasonable part of the parameter space. If you have more samples than that, it may be that your model is misspecified. You can often diagnose this by using calculate() to evaluate the values of greta arrays, given fixed values of model parameters, and checking the results are what you expect, as sketched below.
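For instance, a quick check on the model from the Examples below might look like this (a sketch; mean_plus_sd is a hypothetical derived quantity, and mu and sigma are the greta arrays defined there):

# evaluate a derived greta array at fixed parameter values, to check
# the model behaves as expected
mean_plus_sd <- mu + sigma
calculate(mean_plus_sd, values = list(mu = 0, sigma = 1))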
greta runs multiple chains simultaneously with a single sampler, vectorising all operations across the chains. E.g. a scalar addition in your model is computed as an elementwise vector addition (with vectors having length chains), a vector addition is computed as a matrix addition, etc. TensorFlow is able to parallelise these operations, and this approach reduces computational overheads, so this is the most efficient way of computing on multiple chains.
Multiple mcmc samplers (each of which can simultaneously run multiple chains) can also be run in parallel by setting the execution plan with the future package. Only plan(multisession) futures or plan(cluster) futures that don't use fork clusters are allowed, since forked processes conflict with TensorFlow's parallelism. Explicitly parallelising chains on a local machine with plan(multisession) will probably be slower than running multiple chains simultaneously in a single sampler (with plan(sequential), the default) because of the overhead required to start new sessions. However, plan(cluster) can be used to run chains on a cluster of machines on a local or remote network. See future::cluster for details, and the future.batchtools package to set up plans on clusters with job schedulers.
If n_cores = NULL and mcmc samplers are being run sequentially, each sampler will be allowed to use all CPU cores (possibly to compute multiple chains sequentially). If samplers are being run in parallel with the future package, n_cores will be set so that n_cores * future::nbrOfWorkers is less than the number of CPU cores.
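For example, samplers can be spread across background R sessions by setting a multisession plan before calling mcmc() (a sketch, assuming the model m from the Examples below):

library(future)
plan(multisession)  # run samplers in parallel background R sessions
draws <- mcmc(m, chains = 4)
plan(sequential)    # revert to the default sequential plan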
If the sampler is aborted before finishing (and future parallelism isn't being used), the samples collected so far can be retrieved with stashed_samples(). Only samples from the sampling phase will be returned.
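For example, if a long mcmc() run is interrupted from the console, the sampling-phase draws collected so far can be recovered (a sketch):

# after aborting an mcmc() call part-way through the sampling phase:
draws <- stashed_samples()
summary(draws)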
Samples returned by mcmc() (and stashed_samples()) can be added to with extra_samples(). This continues the chain from the last value of the previous chain and uses the same sampler and model as was used to generate the previous samples. It is not possible to change the sampler or extend the warmup period.
Because opt() acts on a list of greta arrays with possibly varying dimension, the par and hessian objects returned by opt() are named lists, rather than a vector (par) and a matrix (hessian), as returned by stats::optim(). Because greta arrays may not be vectors, the Hessians may not be matrices, but could be higher-dimensional arrays. To return a Hessian matrix covering multiple model parameters, you can construct your model so that all those parameters are in a vector, then split the vector up to define the model. The parameter vector can then be passed to model(). See example.
Value

mcmc, stashed_samples & extra_samples - an mcmc.list object that can be analysed using functions from the coda package. This will contain mcmc samples of the greta arrays used to create model.

opt - a list containing the following named elements:

par: a named list of the optimal values for the greta arrays
value: the (unadjusted) negative log joint density of the model at the parameters 'par'
iterations: the number of iterations taken by the optimiser
convergence: an integer code; 0 indicates successful completion, 1 indicates the iteration limit max_iterations had been reached
hessian: (if hessian = TRUE) a named list of Hessian matrices/arrays for the parameters (w.r.t. value)
Examples

## Not run:
# define a simple Bayesian model
x <- rnorm(10)
mu <- normal(0, 5)
sigma <- lognormal(1, 0.1)
distribution(x) <- normal(mu, sigma)
m <- model(mu, sigma)

# carry out mcmc on the model
draws <- mcmc(m, n_samples = 100)

# add some more samples
draws <- extra_samples(draws, 200)

# initial values can be passed for some or all model variables
draws <- mcmc(m, chains = 1, initial_values = initials(mu = -1))

# if there are multiple chains, a list of initial values should be passed,
# otherwise the same initial values will be used for all chains
inits <- list(initials(sigma = 0.5), initials(sigma = 1))
draws <- mcmc(m, chains = 2, initial_values = inits)

# you can auto-generate a list of initials with something like this:
inits <- replicate(4,
                   initials(mu = rnorm(1), sigma = runif(1)),
                   simplify = FALSE)
draws <- mcmc(m, chains = 4, initial_values = inits)

# or find the MAP estimate
opt_res <- opt(m)

# get the MLE of the normal variance
mu <- variable()
variance <- variable(lower = 0)
distribution(x) <- normal(mu, sqrt(variance))
m2 <- model(variance)

# adjust = FALSE skips the jacobian adjustments used in MAP estimation, to
# give the true maximum likelihood estimates
o <- opt(m2, adjust = FALSE)

# the MLE corresponds to the *unadjusted* sample variance, but differs
# from the sample variance
o$par
mean((x - mean(x)) ^ 2)  # same
var(x)                   # different

# initial values can also be passed to optimisers:
o <- opt(m2, initial_values = initials(variance = 1))

# and you can return a list of the Hessians for each of these parameters
o <- opt(m2, hessian = TRUE)
o$hessian

# to get a hessian matrix across multiple greta arrays, you must first
# combine them and then split them up for use in the model (so that the
# combined vector is part of the model) and pass that vector to model:
params <- c(variable(), variable(lower = 0))
mu <- params[1]
variance <- params[2]
distribution(x) <- normal(mu, sqrt(variance))
m3 <- model(params)
o <- opt(m3, hessian = TRUE)
o$hessian

## End(Not run)