nested.regression: Nested Regression
In BoomSpikeSlab: MCMC for Spike and Slab Regression

nested.regression

R Documentation

Nested Regression

Description

Fits a Bayesian hierarchical regression model to data nested within groups. The model is

% y_{ig} \sim N(x_i \beta_g, \sigma^2) \\ % 1 / \sigma^2 \sim Gamma(df/2, ss/2) \\ % \beta_g \sim N(b, V) \\ %

Optional hyperprior distributions can be supplied to the prior parameters.

% b ~ N(prior.mean, prior.variance) \\ % V ~ InverseWishart(df, variance.guess). \\ %

Either hyperprior can be omitted, in which case the corresponding prior parameter is assumed fixed at the user-supplied value.

Usage

NestedRegression(response,
                 predictors,
                 group.id,
                 residual.precision.prior = NULL,
                 coefficient.prior = NULL,
                 coefficient.mean.hyperprior = NULL,
                 coefficient.variance.hyperprior = NULL,
                 suf = NULL,
                 niter,
                 ping = niter / 10,
                 sampling.method = c("ASIS", "DA"),
                 seed = NULL)

Arguments

`response`	A numeric vector. The response variable to be modeled.
`predictors`	A numeric matrix of predictor variables, including an intercept term if one is desired. The number of rows must match length(response).
`group.id`	A factor (or object that can be converted using `as.factor`) naming the group to which each entry in `response` belongs.
`residual.precision.prior`	An object of type `SdPrior` describing the prior distribution of the residual standard deviation.
`coefficient.prior`	An object of class MvnPrior, or `NULL`. If non-`NULL` this gives the initial values of the prior distribution of the regression coefficients in the nested regression model. This argument must be non-`NULL` if either `coefficient.mean.hyperprior` or `coefficient.variance.hyperprior` is `NULL`.
`coefficient.mean.hyperprior`	An object of class `MvnPrior`, specifying the hyperprior distribution for the mean of `coefficient.prior`. This argument can also be `NULL`, or `FALSE`. If `NULL` then a default prior will be used when learning the mean of the prior distribution. If `FALSE` then the mean of the prior distribution will not be learned; the mean of the `coefficient.prior` distribution will be assumed instead.
`coefficient.variance.hyperprior`	An object of class `InverseWishartPrior`, specifying the hyperprior distribution for the variance of `coefficient.prior`. This argument can also be `NULL`, or `FALSE`. If `NULL` then a default prior will be used when learning the variance of the prior distribution. If `FALSE` then the variance of the prior distribution will not be learned; the variance of the `coefficient.prior` distribution will be assumed instead.
`suf`	A list, where each entry is of type `RegressionSuf`, giving the sufficient statistics for each group, or `NULL`. If `NULL`, then `suf` will be computed from `response`, `predictors`, and `group.id`. If non-`NULL` then these arguments will not be accessed, in which case they can be left unspecified. In 'big data' problems this can be a significant computational savings.
`niter`	The desired number of MCMC iterations.
`ping`	The frequency with which to print status updates.
`sampling.method`	The MCMC sampling scheme that should be used. If either hyperprior is set to `FALSE` then the "DA" method will be used.
`seed`	The integer-valued seed (or `NULL`) to use for the C++ random number generator.

Details

Note: ASIS (Yu and Meng, 2011) has slightly better MCMC convergence, but is slightly slower than the classic DA (data augmentation) method, which alternates between sampling group-level regression coefficients and prior parameters. Both methods are pretty fast.

Value

A list containing MCMC draws from the posterior distribution of model parameters. Each of the following is a vector, matrix, or array, with first index corresponding to MCMC draws, and later indices to distinct parameters.

coefficients: regression coefficients.
residual.sd: the residual standard deviation from the regression model.
prior.mean: The posterior distribution of the coefficient means across groups.
prior.variance: The posterior distribution of the variance matrix describing the distribution of regression coefficients across groups.
priors: A list of the prior distributions used to fit the model.

Author(s)

Steven L. Scott

Examples

SimulateNestedRegressionData <- function() {
  beta.hyperprior.mean <- c(8, 6, 7, 5)
  xdim <- length(beta.hyperprior.mean)
  beta.hyperprior.variance <-
    rWishart(2 * xdim, diag(rep(1, xdim)), inverse = TRUE)

  number.of.groups <- 27
  nobs.per.group = 23
  beta <- rmvn(number.of.groups,
               beta.hyperprior.mean,
               beta.hyperprior.variance)

  residual.sd <- 2.4
  X <- cbind(1, matrix(rnorm(number.of.groups * (xdim - 1) * nobs.per.group),
                       ncol = xdim - 1))
  group.id <- rep(1:number.of.groups, len = nrow(X))
  y.hat <- numeric(nrow(X))
  for (i in 1:nrow(X)) {
    y.hat[i] = sum(X[i, ] * beta[group.id[i], ])
  }
  y <- rnorm(length(y.hat), y.hat, residual.sd)
  suf <- BoomSpikeSlab:::.RegressionSufList(X, y, group.id)

  return(list(beta.hyperprior.mean = beta.hyperprior.mean,
              beta.hyperprior.variance = beta.hyperprior.variance,
              beta = beta,
              residual.sd = residual.sd,
              X = X,
              y = y,
              group.id = group.id,
              suf = suf))
}

d <- SimulateNestedRegressionData()
model <- NestedRegression(suf = d$suf, niter = 500)

BoomSpikeSlab documentation built on May 29, 2024, 5:07 a.m.