GMCMC: General Markov Chain Monte Carlo for BAMLSS

sam_GMCMC    R Documentation

Description

These functions provide a general infrastructure for sampling BAMLSS. The default proposal function is based on iteratively weighted least squares (IWLS). However, each model term may have a different updating function; see the details.

Usage

## Sampler functions:
sam_GMCMC(x, y, family, start = NULL, weights = NULL, offset = NULL,
  n.iter = 1200, burnin = 200, thin = 1, verbose = TRUE,
  step = 20, propose = "iwlsC_gp", chains = NULL, ...)

GMCMC(x, y, family, start = NULL, weights = NULL, offset = NULL,
  n.iter = 1200, burnin = 200, thin = 1, verbose = TRUE,
  step = 20, propose = "iwlsC_gp", chains = NULL, ...)

## Propose functions:
GMCMC_iwls(family, theta, id, eta, y, data,
  weights = NULL, offset = NULL, ...)
GMCMC_iwlsC(family, theta, id, eta, y, data,
  weights = NULL, offset = NULL, zworking, resids, rho, ...)
GMCMC_iwlsC_gp(family, theta, id, eta, y, data,
  weights = NULL, offset = NULL, zworking, resids, rho, ...)
GMCMC_slice(family, theta, id, eta, y, data, ...)

Arguments

x

For the sampler functions, the x list, as returned from function bamlss.frame, holding all model matrices and other information that is used for fitting the model. For the updating functions, an object as returned from function smooth.construct or smoothCon.

y

The model response, as returned from function bamlss.frame.

family

A bamlss family object, see family.bamlss.

start

A named numeric vector containing possible starting values; the names are based on function parameters.

weights

Prior weights on the data, as returned from function bamlss.frame.

offset

Model offsets for use in fitting, as returned from function bamlss.frame.

n.iter

Sets the number of MCMC iterations.

burnin

Sets the burn-in phase of the sampler, i.e., the number of starting samples that should be removed.

thin

Defines the thinning parameter for MCMC simulation. E.g., thin = 10 means that only every 10th sampled parameter will be stored.

verbose

Print information during runtime of the algorithm.

step

Controls how often runtime information is printed while the algorithm is running; the value divides n.iter.

propose

Sets the propose function for model terms, e.g., for a term s(x) in the model formula. By default this is set to "iwlsC_gp", a character string pointing to one of the propose functions listed above. Other options include "iwlsC"; "iwls" and "slice" are also available, but they are more experimental and should not be set by the user. Another option is to pass a full propose function, which is then used for each model term; the structure of propose functions is described in the details below. Model terms may also have different propose functions, see the example section.

chains

How many chains should be started? Chains are sampled sequentially.

theta

The current state of parameters, provided as a named list. The first level represents the parameters of the distribution, the second level the parameters of the model terms. E.g., using the gaussian_bamlss family object theta[["mu"]][["s(x)"]] extracts the current state of a model term "s(x)" of the "mu" parameter. Extraction is done with the id argument.

id

The parameter identifier, a character vector of length 2. The first element specifies the current distributional parameter, the second the current model term.

eta

The current value of the predictors, provided as a named list, one list entry for each parameter. The names correspond to the parameter names in the family object, see family.bamlss. E.g., when using the gaussian_bamlss family object, the current values for the mean can be extracted by eta$mu and for the standard deviation by eta$sigma.

data

An object as returned from function smooth.construct or smoothCon. The object is preprocessed by function bamlss.engine.setup.

zworking

Preinitialized numeric vector of length(y), only for internal usage.

resids

Preinitialized numeric vector of length(y), only for internal usage.

rho

An environment, only for internal usage.

...

Arguments passed to function bamlss.engine.setup and to the propose functions.
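As a rough illustration of how n.iter, burnin and thin interact, the following base R sketch drops a burn-in phase from a stand-in chain and then keeps every thin-th draw. The indexing shown here is an assumption made for illustration; the exact iterations stored by sam_GMCMC() may differ.

```r
## Illustration only: a stand-in chain, not output of sam_GMCMC().
set.seed(1)
n.iter <- 1200
burnin <- 200
thin <- 10

draws <- cumsum(rnorm(n.iter))  # fake correlated draws

## Assumed indexing: discard the first `burnin` draws,
## then keep every `thin`-th of the remaining ones.
keep <- seq(burnin + 1, n.iter, by = thin)
stored <- draws[keep]

length(stored)
```

With these settings the sketch stores (n.iter - burnin) / thin draws.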

Details

The sampler function sam_GMCMC() cycles through all distributional parameters and corresponding model terms in each iteration of the MCMC chain. Samples of the parameters of a model term (e.g., s(x)) are generated by proposal functions, e.g. GMCMC_iwls().
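The cycling scheme can be sketched in plain R. The example below is a toy version under strong assumptions: an intercept-only Gaussian model, flat priors, and a hypothetical random-walk Metropolis proposal (toy_propose) in place of the IWLS proposals. It is not the bamlss implementation; it only mirrors the nested theta structure and the two-element id described in the arguments.

```r
## Toy sketch, not the bamlss code.
set.seed(1)
y <- rnorm(200, mean = 2, sd = 0.5)

## Nested state: distributional parameter -> model term (intercepts only).
theta <- list(mu = list(p = 0), sigma = list(p = 0))

## Log-posterior with flat priors; sigma is modelled on the log scale.
logpost <- function(theta) {
  sum(dnorm(y, mean = theta$mu$p, sd = exp(theta$sigma$p), log = TRUE))
}

## Hypothetical propose function: one random-walk Metropolis step
## for the term addressed by id = c(parameter, term).
toy_propose <- function(theta, id, scale = 0.1) {
  prop <- theta
  prop[[id[1]]][[id[2]]] <- theta[[id[1]]][[id[2]]] + rnorm(1, sd = scale)
  alpha <- logpost(prop) - logpost(theta)
  if (log(runif(1)) < alpha) prop else theta
}

n.iter <- 2000
samps <- matrix(NA_real_, n.iter, 2,
  dimnames = list(NULL, c("mu.p", "sigma.p")))

for (i in seq_len(n.iter)) {
  ## Cycle through all parameters and their model terms.
  for (par in names(theta)) {
    for (term in names(theta[[par]])) {
      theta <- toy_propose(theta, id = c(par, term))
    }
  }
  samps[i, ] <- c(theta$mu$p, theta$sigma$p)
}

## Posterior means after discarding the first 500 draws.
post <- colMeans(samps[-(1:500), ])
```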

The default proposal function that should be used for all model terms is set with argument propose. For smooth terms, e.g. terms created with function s, if a valid propose function is supplied within the extra xt list, this propose function will be used. This way each model term may have its own propose function for creating samples of the parameters. See the example section.
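A term-specific propose function might be specified as sketched below. This assumes that the relevant element of the xt list is named propose, which is not stated on this page and should be checked against the package source. The formula is only constructed here; the sampler call is shown as a comment because it requires the bamlss package.

```r
## Sketch only. Assumption: the xt element is called "propose".
## GMCMC_slice is the propose function listed in the usage section.
f <- num ~ s(x1, bs = "ps") +
  s(x2, bs = "ps", xt = list(propose = GMCMC_slice))

## Not run (requires bamlss):
## d <- GAMart()
## bf <- bamlss.frame(f, data = d, family = "gaussian")
## samps <- with(bf, sam_GMCMC(x, y, family))
```

Under this assumption, the term s(x1) would use the default set with argument propose, while s(x2) would be sampled with GMCMC_slice.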

The default proposal function GMCMC_iwlsC_gp allows for general priors for the smoothing variances and general penalty functions. Samples of smoothing variances are computed using slice sampling. Function GMCMC_iwlsC samples smoothing variances of univariate terms assuming an inverse gamma prior. For terms of higher dimensions, slice sampling is again used to create samples of the smoothing variances.

Function GMCMC_iwls is similar to function GMCMC_iwlsC but uses plain R code.
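To give an idea of what an IWLS-based proposal involves, here is a minimal plain R sketch for a single coefficient block in a Poisson model with log link and a fixed ridge-type penalty. All names (iwls_moments, iwls_step, K, tau2) are hypothetical, and the code is not taken from GMCMC_iwls(); it only illustrates the general technique of proposing from a Gaussian built from working weights and a working response, followed by a Metropolis-Hastings acceptance step.

```r
## Illustration of an IWLS proposal; not the bamlss implementation.
set.seed(42)
n <- 300
X <- cbind(1, runif(n, -1, 1))
beta_true <- c(0.5, 1)
y <- rpois(n, exp(drop(X %*% beta_true)))
K <- diag(2)   # assumed penalty matrix
tau2 <- 100    # assumed fixed smoothing variance

logpost <- function(beta) {
  eta <- drop(X %*% beta)
  sum(dpois(y, exp(eta), log = TRUE)) - 0.5 * sum(beta * (K %*% beta)) / tau2
}

## Mean m and precision P of the Gaussian proposal at state beta.
iwls_moments <- function(beta) {
  eta <- drop(X %*% beta)
  mu <- exp(eta)
  w <- mu                     # working weights (log link)
  z <- eta + (y - mu) / mu    # working response
  P <- crossprod(X * w, X) + K / tau2
  list(m = drop(solve(P, crossprod(X, w * z))), P = P)
}

## Log density of N(m, P^{-1}) at b, up to an additive constant.
ldmvn <- function(b, m, P) {
  d <- b - m
  0.5 * as.numeric(determinant(P, logarithm = TRUE)$modulus) -
    0.5 * sum(d * (P %*% d))
}

iwls_step <- function(beta) {
  cur <- iwls_moments(beta)
  R <- chol(cur$P)  # P = R'R, so backsolve gives a draw with covariance P^{-1}
  prop <- cur$m + drop(backsolve(R, rnorm(length(beta))))
  rev <- iwls_moments(prop)
  alpha <- logpost(prop) - logpost(beta) +
    ldmvn(beta, rev$m, rev$P) - ldmvn(prop, cur$m, cur$P)
  if (log(runif(1)) < alpha) prop else beta
}

## Start near the mode, in the spirit of using an optimizer first.
beta <- unname(glm.fit(X, y, family = poisson())$coefficients)
draws <- matrix(NA_real_, 1000, 2)
for (i in 1:1000) {
  beta <- iwls_step(beta)
  draws[i, ] <- beta
}
post_mean <- colMeans(draws[-(1:100), ])
```

Starting far from the posterior mode can lead to very low acceptance rates with this kind of proposal, which is one reason to supply starting values from an optimizer.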

Function GMCMC_slice applies slice sampling also for the regression coefficients and is therefore relatively slow.
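For readers unfamiliar with slice sampling, the following is a generic univariate slice sampler with stepping out and shrinkage, written in plain R. It is a textbook-style sketch with hypothetical names (slice_step, w, max_steps) and is not the code used by GMCMC_slice(). Each update needs several evaluations of the log density, which hints at why updating every regression coefficient this way is comparatively slow.

```r
## Generic univariate slice sampler; illustration only.
slice_step <- function(x0, logf, w = 1, max_steps = 50) {
  ## Auxiliary level defining the slice {x : logf(x) > logy}.
  logy <- logf(x0) - rexp(1)

  ## Stepping out: place an interval of width w around x0, then expand.
  L <- x0 - runif(1) * w
  R <- L + w
  J <- floor(runif(1) * max_steps)
  K <- (max_steps - 1) - J
  while (J > 0 && logf(L) > logy) {
    L <- L - w
    J <- J - 1
  }
  while (K > 0 && logf(R) > logy) {
    R <- R + w
    K <- K - 1
  }

  ## Shrinkage: draw uniformly; on rejection shrink the interval towards x0.
  repeat {
    x1 <- runif(1, L, R)
    if (logf(x1) > logy) return(x1)
    if (x1 < x0) L <- x1 else R <- x1
  }
}

## Usage: draws from a standard normal target.
set.seed(123)
x <- numeric(5000)
cur <- 0
for (i in seq_along(x)) {
  cur <- slice_step(cur, logf = function(v) dnorm(v, log = TRUE))
  x[i] <- cur
}
c(mean = mean(x), sd = sd(x))
```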

Value

The function returns samples of the parameters. Depending on the return value of the propose functions, other quantities can be returned as well. The samples are provided as an mcmc matrix. If chains > 1, the samples are provided as an mcmc.list.
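Because the return type depends on chains, downstream code may want to treat both cases uniformly. The sketch below uses the coda package with stand-in draws; the column names par1 and par2 and the helper as_list are hypothetical and do not reflect the naming of actual sam_GMCMC() output.

```r
## Stand-in draws; column names are hypothetical.
set.seed(1)
make_chain <- function() {
  matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("par1", "par2")))
}

if (requireNamespace("coda", quietly = TRUE)) {
  one <- coda::mcmc(make_chain())                  # like chains = NULL
  many <- coda::mcmc.list(coda::mcmc(make_chain()),
                          coda::mcmc(make_chain()))  # like chains = 2

  ## Hypothetical helper: always work with an mcmc.list.
  as_list <- function(s) if (coda::is.mcmc.list(s)) s else coda::mcmc.list(s)

  ## Pool all chains into one matrix of draws.
  pooled <- as.matrix(as_list(many))
  single <- as.matrix(as_list(one))
  dim(pooled)
}
```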

References

Umlauf N, Klein N, Zeileis A (2016). Bayesian Additive Models for Location Scale and Shape (and Beyond). (to appear)

See Also

bamlss, bamlss.frame, bamlss.engine.setup, set.starting.values, s2

Examples

## Not run:
## Simulated data example illustrating
## how to call the sampler function.
## This is done internally within
## the setup of function bamlss().
d <- GAMart()
f <- num ~ s(x1, bs = "ps")
bf <- bamlss.frame(f, data = d, family = "gaussian")

## First, find starting values with optimizer.
opt <- with(bf, bfit(x, y, family))

## Sample.
samps <- with(bf, sam_GMCMC(x, y, family, start = opt$parameters))
plot(samps)

## End(Not run)

bamlss documentation built on Oct. 11, 2024, 5:07 p.m.