lmerBayes: A Metropolis MCMC version of lmer.


Description

A Metropolis MCMC version of lmer. A single dependent variable, y, can be fit against any number of predictors, x, with one random effect. Like lmer, the model error can be binomial or Gaussian, but there are two alternatives for the Gaussian (described below). Relative to lmer, the key advantage is that y can be any function of the x's. A second advantage is that the MCMC produces posterior distributions on every parameter, so full confidence limits are available. The principal limitation relative to lmer is that only one random effect is allowed. In addition, the Bayesian MCMC approach is considerably slower.

Usage

lmerBayes(data, ycol, randcol, xcol, start, fixef = NULL, startSD,
  startCov, model = logistic.standard, error = "Binom",
  includeCovar = TRUE, update = "conjugate", badparam = NULL,
  sdfunc = constant, badSDparam, paramfile = NULL, savestep = 500,
  steps = 1000, showstep = 100, burnin = 100, debug = FALSE, ...)

Arguments

data

The table of data, in lmer-style, including one column to be modeled (dependent variable, y), one or more predictors (independent variables, x), and one random effect, using any column names.

randcol

The name of one column holding the random variable; must be a character variable.

xcol, ycol

Character strings giving the names of the columns in data that hold the x and y (numeric) variables.

start

A vector giving the starting set of parameters for the model. It must be as long as the number of parameters required by the model.

startSD

A single starting value for the residual standard deviation, only used with Gaussian and Negative Binomial error models.

startCov

Starting values of the diagonal of the covariance matrix; ignored if a full matrix of start parameters is submitted. Required even if the covariance matrix is not fitted, because it is needed as the starting hyperSD.

model

The function name holding the model describing y's relationship to all the x's, without quote marks. The first argument of the function must be named x, the second param, with additional arguments allowed. The model may accept as x either a vector or a matrix, the latter for a multiple regression. There can be any number of parameters, but the number must match the number given as start parameters. The return value must be a numeric vector with the same size as x.
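As a minimal sketch of the required signature, a straight-line model could be written as below. The name my.linear is hypothetical and not part of the package; the package's own linear.model may be defined differently.

```r
# Hypothetical model function: first argument x, second argument param.
# Returns one prediction per element of a vector x (or per row of a matrix x).
my.linear <- function(x, param) {
  x <- as.matrix(x)                        # treat a vector as a one-column matrix
  stopifnot(length(param) == ncol(x) + 1)  # intercept plus one slope per predictor
  as.vector(param[1] + x %*% param[-1])
}

pred <- my.linear(c(100, 200, 300), param = c(1, 0.01))
```

With a function like this, start would need two values for a single predictor, one for the intercept and one for the slope.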

error

A character variable with 6 possible values: "Binom", "NegBinom", "Pois", "Gauss", "GaussMultResid", or "Flat".

  • "Binom" uses binomial error for residuals.

  • "NegBinom" uses negative binomial error for residuals; the SD is then the dispersion parameter (k) of the negative binomial.

  • "Poisson" uses Poisson error for residuals.

  • "Gauss" uses Gaussian error for residuals with constant standard deviation across groups.

  • "GaussMultResid" uses Gaussian error for residuals, with standard deviation a constant fraction of the model's prediction (and thus only appropriate if predictions are strictly positive).

  • "Flat" is a trivial model where the same likelihood is returned regardless of parameters or data. It is for testing how parameter search behaves in absence of data, as for describing an implied prior.
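The difference between the two Gaussian options can be sketched as a pair of log-likelihood functions. This is an assumed form based on the descriptions above, not the package source; the function names and the argument sd.param are hypothetical.

```r
# y: observations; pred: model predictions; sd.param: residual SD parameter.

# Constant residual standard deviation.
loglike.gauss <- function(y, pred, sd.param) {
  sum(dnorm(y, mean = pred, sd = sd.param, log = TRUE))
}

# Residual standard deviation is a constant fraction of the prediction.
loglike.gauss.mult <- function(y, pred, sd.param) {
  if (any(pred <= 0)) return(-Inf)  # SD would be zero or negative
  sum(dnorm(y, mean = pred, sd = sd.param * pred, log = TRUE))
}

y <- c(1.1, 2.3, 2.9)
pred <- c(1, 2, 3)
ll.const <- loglike.gauss(y, pred, sd.param = 0.2)
ll.mult <- loglike.gauss.mult(y, pred, sd.param = 0.1)
```

The guard in the second function shows why the multiplicative form is only appropriate when predictions are strictly positive.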

includeCovar

TRUE or FALSE, whether to fit the full covariance matrix, vs. variances alone.

update

'conjugate' or 'metropolis', whether to use conjugate inverse-gamma (or inverse-Wishart for full covariance) updates vs. Metropolis steps for updating covariances.

badparam

The name of a function (unquoted) that tests a set of model parameters for validity; must return TRUE if parameters are valid, otherwise FALSE.

sdfunc

The name of a function (unquoted) that models the residual standard deviation as a function of the x's, just like the model function. The default uses the function named constant, meaning the standard deviation is the same for all values of x. Parameters for this function are estimated, just as parameters for the model function are.

badSDparam

The name of a function which tests for invalid parameters for sdfunc, returning TRUE or FALSE (analogous to badparam); a simple version is provided, called badSD, which rejects a single parameter if it is < 0.

paramfile

The name of a file where the entire MCMC chain of parameter values is stored at regular intervals; when parameters are written to the file, they are erased from memory, thus removing the need for the entire chain of all parameters being stored at once while the model is running.

savestep

Parameters are appended to paramfile every savestep steps; must be < steps.

steps

The number of steps to run the Gibbs sampler.

showstep

Information is printed to the screen every showstep steps.

burnin

The number of steps to remove as burn-in before calculating posterior distributions; note that all parameters are saved and returned regardless.
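Because the full chain is returned, burn-in can also be discarded by hand. The sketch below uses a simulated chain as a stand-in for one parameter; it does not use lmerBayes output, and the variable names are illustrative only.

```r
set.seed(1)
steps <- 1100
burnin <- 100

# Stand-in for the chain of one parameter (simulated, not from lmerBayes).
chain <- rnorm(steps, mean = 2, sd = 0.5)

keep <- chain[(burnin + 1):steps]                   # drop the burn-in steps
post.mean <- mean(keep)                             # posterior mean
post.ci <- quantile(keep, probs = c(0.025, 0.975))  # 95% credible interval
```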

debug

Logical. If TRUE, call browser to debug.

...

The typical R means for submitting additional parameters for various functions used in the model (model, sdfunc, badparam, badSDparam).

Details

Data are submitted the way lm or lmer require: a single table, one row per observation, with the random effect in one column. The formula, however, is not submitted using the R-style 'squiggle' (~). Rather, the names of the x, y, and random columns are given. The model describing y as a function of the x's is passed as an argument and must be provided by the user (several are available within the CTFS R Package, in the Utilities topic). The examples below illustrate.

As in lmer, all parameters of the model follow a Gaussian hyperdistribution across the random effects. There is an option to include a full covariance matrix as the hyperdistribution; otherwise, only the variances are fit (i.e., the covariance matrix has only zeroes off the diagonal). There is also an option to use the conjugate inverse-gamma or inverse-Wishart for the variances and covariances; otherwise, Metropolis steps are used.
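For the variances-only case, a conjugate inverse-gamma update of one hyper-variance can be sketched as follows. This is the textbook update under an assumed inverse-gamma prior; the function name and prior values are hypothetical, and the package's own implementation may differ. The inverse-Wishart update for a full covariance matrix is not shown.

```r
# theta: current values of one model parameter, one per random-effect level.
# mu: current hypermean of that parameter.
# prior.shape, prior.rate: assumed inverse-gamma prior (hypothetical values).
draw.hypervar <- function(theta, mu, prior.shape = 0.001, prior.rate = 0.001) {
  n <- length(theta)
  shape <- prior.shape + n / 2
  rate <- prior.rate + sum((theta - mu)^2) / 2
  1 / rgamma(1, shape = shape, rate = rate)  # inverse-gamma draw via reciprocal gamma
}

set.seed(42)
theta <- rnorm(10, mean = 1, sd = 2)
new.var <- draw.hypervar(theta, mu = 1)
```

A conjugate draw like this is always accepted, whereas a Metropolis step proposes a new value and accepts or rejects it based on the likelihood ratio.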

A starting set of parameters for the model must be submitted. It can be a vector as long as the number of parameters required by the model, or it can be a full matrix, with one row of parameters for each of the random effects. The latter requires knowing in advance the names of all the random effects.
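The two forms of start can be sketched as below. Labeling the matrix rows with the random-effect names is an assumption made for illustration; this page does not state how lmerBayes matches rows to random effects.

```r
# Random-effect levels, which must be known in advance for the matrix form.
species <- c("termam", "tachve", "pri2co")

# Vector form: one value per model parameter.
start.vec <- c(1, 0)

# Matrix form: one row per random effect, one column per model parameter.
# Row names are an assumption, not a documented requirement.
start.mat <- matrix(start.vec,
                    nrow = length(species), ncol = length(start.vec),
                    byrow = TRUE, dimnames = list(species, NULL))
```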

A further option reduces memory demand in big models with many MCMC steps. The option paramfile allows the full parameter matrix to be written to a text file every savestep steps, then erased from memory. The function summaryMCMC restores the parameters from the text file into a single large R array.
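The idea of flushing the chain to disk at intervals can be sketched in base R as below. This illustrates the technique only: the file layout used by lmerBayes and read by summaryMCMC is not described here, and the random draws stand in for real parameter updates.

```r
paramfile <- tempfile(fileext = ".txt")
savestep <- 5
steps <- 20

# In-memory buffer holds only savestep rows at a time (two parameters here).
buffer <- matrix(NA_real_, nrow = savestep, ncol = 2)

set.seed(1)
for (i in seq_len(steps)) {
  row <- ((i - 1) %% savestep) + 1
  buffer[row, ] <- rnorm(2)  # stand-in for one step's parameter values
  if (row == savestep) {
    # Append the filled buffer to the file, then clear it.
    write.table(buffer, paramfile, append = TRUE,
                row.names = FALSE, col.names = FALSE)
    buffer[] <- NA_real_
  }
}

# Restore the whole chain from the text file.
full.chain <- as.matrix(read.table(paramfile))
```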

Further details are given in the argument descriptions and the example here; a tutorial on Mortality changes offers a worked example.

Value

A list with several components:

Arguments description

Examples

## Not run: 
# Assume two plot datasets from BCI are available, bciex::bci12t6mini and
# bciex::bci12t7mini.
# Subset to trees above 10 cm dbh and just 10 species for illustration (the
# model will run much faster). The fixed effect, species-level variation
# (or error), and the model parameters for each species are shown below.
# Check the names of the result to see what else lmerBayes returns.

gtable = growth.indiv(bciex::bci12t6mini, bciex::bci12t7mini, mindbh = 100)
a_few_species = c(
  'termam',
  'tachve',
  'pri2co',
  'gustsu',
  'cecrin',
  'tet2pa',
  'guatdu',
  'vochfe',
  'virose',
  'maquco'
)
gtable = subset(gtable, !is.na(incgr) & sp %in% a_few_species)
mod = lmerBayes(
  data = gtable,
  ycol = 'incgr',
  xcol = 'dbh1',
  randcol = 'sp',
  start = c(1, 0),
  startSD = 1,
  startCov = 1,
  model = linear.model,
  error = 'Gauss',
  includeCovar = FALSE,
  badSDparam = badSD,
  steps = 1100,
  showstep = 50,
  burnin = 100
)
mod$bestmu
diag(sqrt(mod$bestsigma))
mod$best
names(mod)

## End(Not run)

forestgeo/ctfs documentation built on May 3, 2019, 6:44 p.m.