lmerBayes: A Metropolis MCMC version of lmer.
In forestgeo/ctfs: Tools for the Analysis of Forest Dynamics


A Metropolis MCMC version of lmer. A single dependent variable, y, can be fit against any number of predictors, x, with one random effect. Like lmer, the model error can be binomial or Gaussian, but there are two alternatives for the Gaussian (described below). Relative to lmer, the key advantage is that y can be any function of the x. A second advantage is that the MCMC produces posterior distributions for every parameter, so full credible intervals are available. The principal limitation relative to lmer is that only one random effect is allowed. In addition, the Bayesian MCMC approach is considerably slower.

lmerBayes(data, ycol, randcol, xcol, start, fixef = NULL, startSD,
  startCov, model = logistic.standard, error = "Binom",
  includeCovar = TRUE, update = "conjugate", badparam = NULL,
  sdfunc = constant, badSDparam, paramfile = NULL, savestep = 500,
  steps = 1000, showstep = 100, burnin = 100, debug = FALSE, ...)

`data`	The table of data, in lmer-style, including one column to be modeled (dependent variable, y), one or more predictors (independent variables, x), and one random effect, using any column names.
`randcol`	The name of one column holding the random variable; must be a character variable.
`xcol, ycol`	Character strings giving the names of the columns in `data` that hold the x and y (numeric) variables.
`start`	A vector giving the starting set of parameters for the model. It must be as long as the number of parameters required by the model.
`startSD`	A single starting value for the residual standard deviation, only used with Gaussian and Negative Binomial error models.
`startCov`	Starting values of the diagonal of the covariance matrix; ignored if a full matrix of start parameters is submitted. Required even if covariance matrix is not fitted, because needed as starting hyperSD.
`model`	The function name holding the model describing y's relationship to all the x's, without quote marks. The first argument of the function must be named x, the second param, with additional arguments allowed. The model may accept as x either a vector or a matrix, the latter for a multiple regression. There can be any number of parameters, but the number must match the number given as start parameters. The return value must be a numeric vector with the same size as x.
`error`	A character variable with 6 possible values: "Binom", "NegBinom", "Pois", "Gauss", "GaussMultResid", or "Flat". "Binom" uses binomial error for residuals. "NegBinom" uses negative binomial error for residuals; the SD is then the dispersion parameter (k) of the negative binomial. "Pois" uses Poisson error for residuals. "Gauss" uses Gaussian error for residuals with constant standard deviation across groups. "GaussMultResid" uses Gaussian error for residuals, with standard deviation a constant fraction of the model's prediction (and thus only appropriate if predictions are strictly positive). "Flat" is a trivial model where the same likelihood is returned regardless of parameters or data. It is for testing how the parameter search behaves in the absence of data, as for describing an implied prior.
`includeCovar`	TRUE or FALSE, whether to fit the full covariance matrix, vs. variances alone.
`update`	'conjugate' or 'metropolis', whether to use the conjugate inverse-gamma (or inverse-Wishart for full covariance) vs. Metropolis steps for updating covariances.
`badparam`	The name of a function (unquoted) that tests a set of model parameters for validity; must return TRUE if parameters are valid, otherwise FALSE.
`sdfunc`	The name of a function (unquoted) that models the residual standard deviation as a function of the x's, just like the model function. The default uses the function named constant, meaning the standard deviation is the same for all values of x. Parameters for this function are estimated, just as parameters for the model function are.
`badSDparam`	The name of a function which tests for invalid parameters for sdfunc, returning TRUE or FALSE (analogous to badparam); a simple version is provided, called badSD, which rejects a single parameter if it is < 0.
`paramfile`	The name of a file where the entire MCMC chain of parameter values is stored at regular intervals; when parameters are written to the file, they are erased from memory, thus removing the need for the entire chain of all parameters being stored at once while the model is running.
`savestep`	Parameters are appended to paramfile every savestep steps; must be < steps.
`steps`	The number of steps to run the Gibbs sampler.
`showstep`	Information is printed to the screen every showstep steps.
`burnin`	The number of steps to remove as burn-in before calculating posterior distributions; note that all parameters are saved and returned regardless.
`debug`	Logical. If TRUE, call browser to debug.
`...`	The typical R means for submitting additional parameters for various functions used in the model (`model`, `sdfunc`, `badparam`, `badSDparam`).
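
As a rough sketch of the shape a user-supplied `model` function takes, assuming only what the argument description above states (first argument `x`, second argument `param`, one prediction per observation). The names `my.saturating.model` and `my.multi.linear` are hypothetical and are not part of the CTFS package.

```r
# Hypothetical single-predictor model: y = a * x / (b + x).
# The first argument is named x and the second param, as lmerBayes expects.
my.saturating.model <- function(x, param) {
  param[1] * x / (param[2] + x)
}

# Hypothetical multiple-regression model: x is a matrix with one column per
# predictor; param holds an intercept followed by one slope per column.
my.multi.linear <- function(x, param) {
  x <- as.matrix(x)
  as.vector(param[1] + x %*% param[-1])
}

# One prediction per observation from a vector of x values.
x1 <- c(10, 20, 40)
pred1 <- my.saturating.model(x1, param = c(5, 20))

# One prediction per row from a matrix of x values.
x2 <- cbind(dbh = c(10, 20, 40), light = c(0.2, 0.5, 0.9))
pred2 <- my.multi.linear(x2, param = c(1, 0.1, 2))
```

With either function, the length of `start` would have to match the number of elements the function reads from `param` (two and three here, respectively).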

Data are submitted the way lm or lmer require: a single table with one row per observation and the random effects in one column. The formula, however, is not submitted using the R-style 'squiggle' (~). Rather, the names of the x, y, and random-effect columns are given. The model describing y as a function of the x's is passed as a function and must be provided by the user (several are available within the CTFS R Package, in the Utilities topic). The examples below illustrate this.
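
To make the table layout concrete, here is a minimal sketch with invented values. The column names and species codes are hypothetical; any names can be used, provided they are passed as character strings.

```r
# Hypothetical table: one row per observation, random effect in one column.
growth <- data.frame(
  sp    = c("sp_a", "sp_a", "sp_b", "sp_b", "sp_c"),  # random effect
  dbh1  = c(120, 250, 105, 400, 180),                 # predictor (x)
  incgr = c(1.2, 0.8, 2.1, 0.5, 1.0),                 # response (y)
  stringsAsFactors = FALSE
)

# Column names are given as character strings rather than as a formula.
ycol <- "incgr"
xcol <- "dbh1"
randcol <- "sp"

# Number of observations available for each random effect.
obs_per_group <- table(growth[[randcol]])
```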

As in lmer, all parameters of the model follow a Gaussian hyperdistribution across the random effects. There is an option to include a full covariance matrix in the hyperdistribution; otherwise, only the variances are fit (ie, the covariance matrix has only zeroes off-diagonal). There is also an option to use the conjugate inverse-gamma or inverse-Wishart for the variances and covariances; otherwise, Metropolis steps are used.
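
The difference between the two hyperdistribution options can be illustrated by simulation. This is a sketch with invented numbers, not code from lmerBayes: group-level parameters are drawn from a Gaussian around hypothetical fixed effects, once with a diagonal covariance matrix and once with a full one.

```r
set.seed(1)
n_groups <- 2000
mu <- c(intercept = 1, slope = 0.5)   # hypothetical fixed effects

# Variances only: off-diagonal elements are zero.
sigma_diag <- diag(c(0.25, 0.04))
# Full covariance: a non-zero off-diagonal term links the two parameters.
sigma_full <- matrix(c(0.25, 0.06,
                       0.06, 0.04), nrow = 2, byrow = TRUE)

# Draw n parameter vectors from N(mu, sigma) using a Cholesky factor.
draw_group_params <- function(n, mu, sigma) {
  z <- matrix(rnorm(n * length(mu)), nrow = n)
  sweep(z %*% chol(sigma), 2, mu, "+")
}

par_diag <- draw_group_params(n_groups, mu, sigma_diag)
par_full <- draw_group_params(n_groups, mu, sigma_full)

# Correlation between intercepts and slopes across groups.
cor_diag <- cor(par_diag)[1, 2]
cor_full <- cor(par_full)[1, 2]
```

With the diagonal matrix the group-level intercepts and slopes are uncorrelated apart from sampling noise, whereas the full matrix induces a correlation of 0.06 / (0.5 * 0.2) = 0.6 between them.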

A starting set of parameters for the model must be submitted. It can be a vector as long as the number of parameters required by the model, or it can be a full matrix, with one row of parameters for each of the random effects. The latter requires knowing in advance the names of all the random effects.
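
A minimal sketch of the two forms of `start` follows. Labelling the matrix rows with the random-effect names is an assumption made here for readability; the source does not say how rows are matched to random effects.

```r
# Random effects, which must be known in advance for the matrix form.
rand_names <- c("cecrin", "gustsu", "termam")

# Vector form: one starting value per model parameter.
start_vec <- c(1, 0)

# Matrix form: one row per random effect, one column per parameter.
# Row names are an assumption for illustration only.
start_mat <- matrix(start_vec,
                    nrow = length(rand_names),
                    ncol = length(start_vec),
                    byrow = TRUE,
                    dimnames = list(rand_names, NULL))

# Group-specific starting values can then be adjusted row by row.
start_mat["gustsu", ] <- c(2, 0.01)
```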

There is a further complication whose purpose is to reduce memory demand in big models with many MCMC steps. The option paramfile allows the full parameter matrix to be written to a text file every savestep steps, then erased from memory. The function summaryMCMC restores the parameters from the text file into one large R array.
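
The general idea of writing a chain to disk in blocks can be sketched in base R as below. This is a generic illustration under stated assumptions, not the implementation or the file format used by lmerBayes or summaryMCMC: `chain_file` stands in for paramfile, and random numbers stand in for the MCMC draws.

```r
set.seed(2)
steps <- 1000
savestep <- 250
npar <- 2
chain_file <- tempfile(fileext = ".txt")   # stands in for paramfile

# Only savestep rows are held in memory at any time.
buffer <- matrix(NA_real_, nrow = savestep, ncol = npar)
for (i in seq_len(steps)) {
  row <- ((i - 1) %% savestep) + 1
  buffer[row, ] <- rnorm(npar)             # placeholder for one MCMC draw
  if (row == savestep) {
    # Append the filled block to the file, then reuse the buffer.
    write.table(buffer, chain_file, append = TRUE,
                row.names = FALSE, col.names = FALSE)
  }
}

# Restoring the whole chain afterwards is a single read.
full_chain <- as.matrix(read.table(chain_file))
```

The trade-off is extra file input and output during the run in exchange for a memory footprint that depends on savestep rather than on the total number of steps.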

Further details are given in the descriptions of the arguments and in the example here; in addition, a tutorial on Mortality changes offers a worked example.

A list with several components:

mu: A 2D array with the entire chain of model parameters (ie, fixed effects) from the Gibbs sampler
sigma: A 3D array with the entire chain of covariances from the Gibbs sampler; if includeCovar==FALSE, only the diagonal is non-zero
bestmu: Best estimate of the model parameters for the entire data (ie, fixed effect)
bestsigma: Best estimate of the covariance (ie, group-level variance or error)
resid: The entire chain of parameters for the model of residuals
bestresid: The best estimate of parameters for the model of residuals
CIresid: Credible intervals for the parameters for the model of residuals
best: The best estimates of model parameters for each random effect
lower: Lower credible intervals of model parameters for each random effect
upper: Upper credible intervals of model parameters for each random effect
burn: The burn-in
llike: Full log-likelihood of the model at each step of the Gibbs sampler
bestlike: The log-likelihood of the optimal parameter combination (means of the posterior distribution)
DIC: Deviance information criterion of the model
obs: The original y (dependent) variable, just as submitted
data: The original x (independent) variables, just as submitted
model: The model's predictions, as a list with one element per random effect
randlike: The log-likelihood of observations for each random effect given the optimal parameters (a vector, one per random effect)
keep: The steps of the Gibbs sampler after burn-in, as a vector of negative numbers
start: The start parameters submitted
randeffects: The names of all the random effects
parnames: The names of the model parameters
fullparam: A 3D array with all parameters of the Gibbs sampler; one dimension is for the random effects, with each random effect having a matrix of model parameters for every step of the Gibbs sampler
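
As a hedged illustration of how best estimates and credible intervals relate to a stored chain, the sketch below summarizes a simulated chain after dropping the burn-in with negative indices, in the spirit of `keep`. The orientation (one row per step), the use of posterior means, and the 95% quantile limits are assumptions for illustration; the object `chain` is simulated and is not output from lmerBayes.

```r
set.seed(3)
steps <- 1100
burnin <- 100

# Simulated stand-in for a chain of two parameters (rows are steps).
chain <- cbind(intercept = rnorm(steps, mean = 1, sd = 0.1),
               slope = rnorm(steps, mean = 0.5, sd = 0.05))

# Negative indices drop the burn-in rows.
keep <- -(1:burnin)
post <- chain[keep, , drop = FALSE]

# Posterior means and 95% credible limits for each parameter.
best <- colMeans(post)
ci <- apply(post, 2, quantile, probs = c(0.025, 0.975))
```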

`start`	Apart from a vector, this can be a matrix of such vectors, one row per random effect.

## Not run: 
# Assume two plot datasets from BCI are available, bciex::bci12t6mini and
# bciex::bci12t7mini.
# Subset to trees above 10 cm dbh and just 10 species for illustration (the
# model will run much faster). The fixed effect, species-level variation
# (or error), and the model parameters for each species are shown below.
# Check the names of the result to see what else lmerBayes returns.

gtable = growth.indiv(bciex::bci12t6mini, bciex::bci12t7mini, mindbh = 100)
a_few_species = c(
  'termam',
  'tachve',
  'pri2co',
  'gustsu',
  'cecrin',
  'tet2pa',
  'guatdu',
  'vochfe',
  'virose',
  'maquco'
)
gtable = subset(gtable, !is.na(incgr) & sp %in% a_few_species)
mod = lmerBayes(
  data = gtable,
  ycol = 'incgr',
  xcol = 'dbh1',
  randcol = 'sp',
  start = c(1, 0),
  startSD = 1,
  startCov = 1,
  model = linear.model,
  error = 'Gauss',
  includeCovar = FALSE,
  badSDparam = badSD,
  steps = 1100,
  showstep = 50,
  burnin = 100
)
mod$bestmu
diag(sqrt(mod$bestsigma))
mod$best
names(mod)

## End(Not run)