dream_parallel: Differential Evolution Adaptive Metropolis (DREAM) algorithm
In tpilz/HydroBayes: Bayesian statistics for use in hydrological applications

Description Usage Arguments Details Value Note Author(s) References

Implementation of the DREAM algorithm, including variations (such as DREAM(zs) and DEAM(ABC)) depending on certain argument settings.

dream_parallel(fun, ..., lik = NULL, par.info = list(initial = NULL, min =
  NULL, max = NULL, mu = NULL, cov = NULL, val_ini = NULL, bound = NULL, names =
  NULL, prior = "flat"), nc, t, d, burnin = 0, adapt = 0.1,
  updateInterval = 10, delta = 3, c_val = 0.1, c_star = 1e-12,
  nCR = 3, p_g = 0.2, beta0 = 1, thin = 1, outlier_check = TRUE,
  obs = NULL, abc_rho = NULL, abc_e = NULL, glue_shape = NULL,
  lik_fun = NULL, past_sample = FALSE, m0 = NULL, archive_update = NULL,
  psnooker = 0, mt = 1, ncores = 1, checkConvergence = FALSE,
  verbose = TRUE)

`fun`	`character`. Name of a function(x, ...) which is evaluated at x being a d-dimensional parameter vector. Returns a likelihood, log-likelihood, simulation values or summary statistics depending on argument `lik`.
`...`	Additional arguments for `fun`.
`lik`	`integer`. Flag specifying a likelihood function to be used, see details.
`par.info`	A `list` of parameter information (see arguments `initial`, `min`, `max`, `mu`, `cov`, `val_ini`, `bound`, `names`, and `prior` below).
`nc`	`numeric`. Number of chains evolved in parallel.
`t`	`numeric`. Number of samples from the Markov chain.
`d`	`numeric`. Number of parameters.
`burnin`	`numeric`. Length of the burn-in period as portion of t (`burnin period = burnin * t`). These samples from the Markov chain will not be included in the output. Default: 0.
`adapt`	`numeric`. Length of the adaptation period as portion of t (`adaptation period = adapt * t`). Will be used for the update of crossover probabilities (NB: useful if some parameter are correlated as they will get a higher chance of being updated jointly) and the replacement of outlier chains (NB: this can enhance convergence to the target distribution). Default: 0.1.
`updateInterval`	`integer`. Interval for crossover probability updates during the adaptation period.
`delta`	`integer`. Maximum number of chain pairs used to generate the jump (default: 3).
`c_val`	`numeric`. Lambda value is sampled from U[-c_val,c_val] (default: 0.1).
`c_star`	`numeric`. Zeta value sampled from N[0,c_star]. Should be small compared to target (i.e. in this case the normal) distribution. Default: 1e-12.
`nCR`	`integer`. Length of vector with crossover probabilities for parameter subspace sampling (default: 3).
`p_g`	`numeric`. Probability for gamma, the jump rate, being equal to 1. Default: 0.2.
`beta0`	`numeric`. Reduce jump distance, e.g. if the average acceptance rate is low (less than 15 %). `0 < beta0 <= 1`. Default: 1 (i.e. jump distance is not adjusted).
`thin`	`integer`. Thinning to be applied to output in case of large `t`. See below.
`outlier_check`	`logical`. Shall outlier chains be identified and removed (only if `past_sample == FALSE`)? Default: `TRUE`.
`obs`	`numeric` vector of observations to be compared with output of `fun`. Only needed for some realisations of `lik` (see details).
`abc_rho`	`character`. Name of an ABC distance function(sim, obs) calculating the distance between simulated ('sim', output of `fun`) and observed (`obs`) diagnostic values. Needed for ABC methods of `lik`.
`abc_e`	`numeric` vector of length of `obs` specifying the ABC tolerance value (`lik = 22`) or representing the standard deviations for each summary statistics (e.g. streamflow signatures) returned by `fun` (`lik = 21`).
`glue_shape`	`numeric` scalar value used for GLUE-based informal likelihood functions (see details).
`lik_fun`	`character` specifying the name of a user-defined likelihood function(sim, obs) with sim being the output of `fun` and obs a vector of corresponding observations (argument `obs` above).
`past_sample`	`logical`. Shall proposal generation by sampling from past states be applied (i.e., DREAM(zs))? Default: `FALSE`.
`m0`	`integer`. The initial archive size (needed if `past_sample == TRUE`). Default: `NULL` (if `past_sample == TRUE` and `is.null(m0)` it will be set to `10*d`).
`archive_update`	`integer`. Update the archive by appending the current state of each Markov chain at every [`archive_update`]th iteration. Default: `NULL` (if `past_sample == TRUE` and `is.null(archive_update)` it will be set to 10).
`psnooker`	`numeric` value giving the probability for a snooker (instead of the conventional parallel direction) jump. Might be useful for complex target distributions to enhance the diversity of proposals. In such a case, a recommended value is 0.1. Otherwise, leave it at zero to prevent snooker updates (the default). NOTE: Can only be applied if `past_sample == TRUE`!
`mt`	`integer` specifying the number of sampling trials. A value greater one results in MT-DREAM. Default: 1.
`ncores`	`integer` specifying the number of CPU cores to be used. If > 1, packages `doMC` (Linux only!) and `parallel` are needed. Values > 1 only useful if `fun` is very complex and computational demanding, otherwise multiple thread handling will cause function slowdown! Default: 1.
`checkConvergence`	`logical`. Shall convergence of the MCMC chain be checked? Currently implemented: Calculating the Gelman-Rubin diagnostic. Takes a lot of time! Default: FALSE.
`verbose`	`logical`. Print progress bar to console? Default: TRUE.
`initial`	`character`. Method for prior sampling. One of: uniform - sampling from a uniform distribution; normal - a (multivariate) normal distribution; latin - latin hypercube sampling; user - value(s) given by the user.
`min`	`numeric`. A d-dimensional vector of minimum values for each parameter to sample from if `initial` is 'uniform' or 'latin'. Also defines bounding region for the proposal, see `bound`.
`max`	`numeric`. A d-dimensional vector of maximum values for each parameter to sample from if `initial` is 'uniform' or 'latin'. Also defines bounding region for the proposal, see `bound`.
`mu`	`numeric`. d-dimensional vector of parameter means if `initial` is 'normal'.
`cov`	`numeric`. d-by-d positive-definite symmetric matrix of parameter covariances if `initial` is 'normal'.
`val_ini`	`numeric` nc-by-d-dimensional matrix of prior values if `initial` is 'user'.
`bound`	`character`. What to do if the proposal parameter is outside the defined min-max limits. One of: bound - proposal is set to min/max value if it is smaller/larger than the defined limit. reflect - parameter value is reflected at the boundary towards the feasible space by the amount of boundary violation. fold - upper bound of each parameter dimension is connected to its respective lower bound (NOTE: this ensures detailed balance of the MCMC simulation but may lead to inflation of acceptance rates of prposals). NULL (default): nothing is done and proposals outside the feasible parameter range will be evaluated as well.
`names`	`character` vector of length d with names for the parameters. These can be used within `fun` (in this case, parameter input x of `fun` is a named vector) and will appear in the output list element 'chain'.
`prior`	`character` specifying the prior pdf, i.e. posterior ~ prior x likelihood. One of: 'flat' (non-informative prior, log-pdf set to zero and posterior ~ likelihood; DEFAULT); 'uniform' (prior is assumed to be uniformly distributed between `min` and `max`; multiplicative prior, parameters must not be correlated!); 'normal' (prior assumed to be normally distributed with `mu` and `cov`, i.e. parameters may be correlated); name of a user-defined function(x) with x being a d-dimensional parameter vector and returning the log-density of the d-variate prior distribution at x.

To understand the notation (e.g. what is lambda, nCR etc.), have a look at Sect. 3.3 of the reference paper (see below).

Likelihood options (argument lik):

1: fun returns the likelihood of parameter realisation x given some observations.

2: fun returns the log-likelihood.

21: ABC diagnostic model evaluation using a continuous fitness kernel, a variation of so-called noisy-ABC.

22: ABC diagnostic model evaluation using a Boxcar likelihood. fun has to return m diagnostic values to be compared with observations. Requires arguments obs, abc_rho, and abc_e. Modification in computation of Metropolis probability is used (Eq. 13 of Sadegh and Vrugt, 2014). Prior pdf set to zero!

31: GLUE with informal likelihood function based on the NSE: glue_shape * log(NSE). fun needs to return a time series of model simulations and obs should contain a time series of corresponding observations. glue_shape affects the sampling: small values result in high parameter uncertainty, large values produce narrow posterior pdfs of parameters. Should be in the range of 1-100 (experiment!).

99: A user-defined function, see argument lik_fun.

list with named elements:

chain: a (1-burnin)*t/thin-by-d+3-by-nc array of parameter realisations and the log-pdfs of the prior ('lp'), likelihood ('ll'), and posterior ('lpost') distribution for each iteration and Markov chain;

fx: a (1-burnin)*t/thin-by-nc-by-[length of fun's output] array of raw output of fun, corresponds with parameter realisations in chain;

runtime: time of function execution;

outlier: a list with adapt*t vectors of outlier indices in nc (value of 0 means no outliers);

AR: a [floor(t/updateInterval)+1]-by-2 matrix giving the total number of proposal evaluations an the associated acceptance rate averaged over the last updateInterval * nc evaluations.

CR: a [floor(t/updateInterval)+1]-by-[nCR+1] matrix giving the total number of proposal evaluations (first column) an the current selection probability for each of the nCR crossover values.

IF checkConvergence == TRUE:

R_stat: a (1-burnin)*t/thin-50+1-by-d matrix giving the Gelman-Rubin convergence diagnostic (note that at least 50 observations are used to compute R_stat).

If you want to use a non-implemented likelihood function with nuisance variables to be calibrated along with the actual model parameters, it is suggested to implement the likelihood calculation directly into fun.

Tobias Pilz tpilz@uni-potsdam.de

Code based on:

Vrugt, J. A.: "Markov chain Monte Carlo simulation using the DREAM software package: Theory, concepts, and MATLAB implementation." Environmental Modelling & Software, 2016, 75, 273 – 316, http://dx.doi.org/10.1016/j.envsoft.2015.08.013.

For ABC method see:

Sadegh, M. and J. A. Vrugt: "Approximate Bayesian computation using Markov Chain Monte Carlo simulation: DREAM_ABC". Water Resources Research, 2014, 50, 6767 – 6787, http://dx.doi.org/10.1002/2014WR015386.

For MT-DREAM(ZS) see:

Laloy, E. and J. A. Vrugt: "High-dimensional posterior exploration of hydrologic models using multiple-try DREAM(ZS) and high-performance computing". Water Resources Research, 2012, 48, W01526, http://dx.doi.org/10.1029/2011WR010608.

tpilz/HydroBayes documentation built on May 6, 2019, 3:44 p.m.