mr.ash: Multiple Regression with Adaptive Shrinkage
In stephenslab/mr.ash.alpha: Multiple Regression with Adaptive Shrinkage

View source: R/mr.ash.R

mr.ash

R Documentation

Multiple Regression with Adaptive Shrinkage

Description

Model fitting algorithms for Multiple Regression with Adaptive Shrinkage ("Mr.ASH"). Mr.ASH is a variational empirical Bayes (VEB) method for multiple linear regression. The fitting algorithms (locally) maximize the approximate marginal likelihood (the "evidence lower bound", or ELBO) via coordinate-wise updates.

Usage

mr.ash(
  X,
  y,
  Z = NULL,
  sa2 = NULL,
  method_q = c("sigma_dep_q", "sigma_indep_q"),
  method_g = c("caisa", "accelerate", "block"),
  max.iter = 1000,
  min.iter = 1,
  beta.init = NULL,
  update.pi = TRUE,
  pi = NULL,
  update.sigma2 = TRUE,
  sigma2 = NULL,
  update.order = NULL,
  standardize = FALSE,
  intercept = TRUE,
  tol = set_default_tolerance()
)

Arguments

`X`	The input matrix, of dimension (n,p); each column is a single predictor; and each row is an observation vector. Here, n is the number of samples and p is the number of predictors. The matrix cannot be sparse.
`y`	The observed continuously-valued responses, a vector of length p.
`Z`	The covariate matrix, of dimension (n,k), where k is the number of covariates. This feature is not yet implemented; `Z` must be set to `NULL`.
`sa2`	The vector of prior mixture component variances. The variances should be in increasing order, starting at zero; that is, `sort(sa2)` should be the same as `sa2`. When `sa2` is `NULL`, the default setting is used, `sa2[k] = (2^(0.05*(k-1)) - 1)^2`, for `k = 1:20`. For this default setting, `sa2[1] = 0`, and `sa2[20]` is roughly 1.
`method_q`	The algorithm used to update the variational approximation to the posterior distribution of the regression coefficients, `method = "sigma_dep_q"`, `method = "sigma_indep_q"` and `"sigma_scaled_beta"`, take different approaches to updating the residual variance `sigma^2`.
`method_g`	`method = "caisa"`, an abbreviation of "Cooridinate Ascent Iterative Shinkage Algorithm", fits the model by approximate EM; it iteratively updates the variational approximation to the posterior distribution of the regression coefficients (the approximate E-step) and the model parameters (mixture weights and residual covariance) in an approximate M-step. Settings `method = "block"` and `method = "accelerate"` are considered experimental.
`max.iter`	The maximum number of outer loop iterations allowed.
`min.iter`	The minimum number of outer loop iterations allowed.
`beta.init`	The initial estimate of the (approximate) posterior mean regression coefficients. This should be `NULL`, or a vector of length p. When `beta.init` is `NULL`, the posterior mean coefficients are all initially set to zero.
`update.pi`	If `update.pi = TRUE`, the mixture proportions in the mixture-of-normals prior are estimated from the data. In the manuscript, `update.pi = TRUE`.
`pi`	The initial estimate of the mixture proportions `\pi_1, \ldots, \pi_K`. If `pi` is `NULL`, the mixture weights are initialized to `rep(1/K,K)`

, where K = length(sa2).

`update.sigma2`	If `update.sigma2 = TRUE`, the residual variance `sigma^2` is estimated from the data. In the manuscript, `update.sigma = TRUE`.
`sigma2`	The initial estimate of the residual variance, `\sigma^2`. If `sigma2 = NULL`, the residual variance is initialized to the empirical variance of the residuals based on the initial estimates of the regression coefficients, `beta.init`, after removing linear effects of the intercept and any covariances.
`update.order`	The order in which the co-ordinate ascent updates for estimating the posterior mean coefficients are performed. `update.order` can be `NULL`, `"random"`, or any permutation of `(1,\ldots,p)`, where `p` is the number of columns in the input matrix `X`. When `update.order` is `NULL`, the co-ordinate ascent updates are performed in order in which they appear in `X`; this is equivalent to setting `update.order = 1:p`. When `update.order = "random"`, the co-ordinate ascent updates are performed in a randomly generated order, and this random ordering is different at each outer-loop iteration.
`standardize`	The logical flag for standardization of the columns of X variable, prior to the model fitting. The coefficients are always returned on the original scale.
`intercept`	When `intercept = TRUE`, an intercept is included in the regression model.
`tol`	Additional settings controlling behaviour of the model fitting algorithm. `tol$convtol` controls the termination criterion for the model fitting. When `update.pi = TRUE`, the outer-loop updates stop when the largest change in the mixture weights is less than `convtolK`; when `update.pi = FALSE`, the outer-loop updates stop when the largest change in the estimates of the posterior mean coefficients is less than `convtolK`. `tol$epstol` is a small, positive number added to the likelihoods to avoid logarithms of zero.

Details

Mr.ASH is a statistical inference method for the following multiple linear regression model:

y | X, \beta, \sigma^2 ~ N(X \beta, \sigma I_n),

in which the regression coefficients \beta admit a mixture-of-normals prior,

\beta | \pi, \sigma ~ g = \sum_{k=1}^K N(0, \sigma^2 \sigma_k^2).

Each mixture component in the prior, g, is a normal density centered at zero, with variance \sigma^2 \sigma_k^2.

The fitting algorithm, it run for a large enough number of iterations, will find an approximate posterior for the regression coefficients, denoted by q(\beta), residual variance parameter sigma^2, and prior mixture weights \pi_1, \ldots, \pi_K maximizing the evidence lower bound,

F(q, \pi, \sigma^2) = E_q \log p(y | X, \beta, \sigma^2) - \sum_{j=1}^p D_{KL}(q_j || g),

where D_{KL}(q_j || g) denotes the Kullback-Leibler (KL) divergence, a measure of the "distance" between (approximate) posterior q_j(\beta_j) and prior g(\beta_j). The fitting algorithm iteratively updates the approximate posteriors q_1, \ldots, q_p, separately for each j = 1, \ldots, p (in an order determined by update.order), then separately updates the mixture weights and \pi and residual variance \sigma^2. This coordinate-wise update scheme iterates until the convergence criterion is met, or until the algorithm hits an upper bound on the number of iterations (specified by max.iter).

See ‘References’ for more details about the model and algorithm.

Value

A list object with the following elements:

`intercept`	The estimated intercept.
`beta`	A vector containing posterior mean estimates of the regression coefficients for all predictors.
`sigma2`	The estimated residual variance.
`pi`	A vector of containing the estimated mixture proportions.
`iter`	The number of outer-loop iterations that were performed.
`update.order`	The ordering used for performing the coordinate-wise updates. For `update.order = "random"`, the orderings for outer-loop iterations are provided in a vector of length `p*max.iter`, where `p` is the number of predictors.
`varobj`	A vector of length `numiter`, containing the value of the variational objective (equal to the negative "evidence lower bound") attained at each (outer-loop) model fitting iteration. Note that the objective does not account for the intercept term, even when `intercept = TRUE`; therefore, this value shoudl be interpreted as being an approximation to the marginal likelihood conditional on the estimate of the intercept.
`data`	The preprocessed data (X, Z, y) provided as input to the model fitting algorithm. `data$w` is equal to `diag(crossprod(X))`, in which `X` is the preprocessed data matrix. Additionally, `data$sa2` gives the prior variances used.

References

Y. Kim (2020), Bayesian shrinkage methods for high dimensional regression. Ph.D. thesis, University of Chicago.

Examples

### generate synthetic data
set.seed(1)
n           = 200
p           = 300
X           = matrix(rnorm(n*p),n,p)
beta        = double(p)
beta[1:10]  = 1:10
y           = X %*% beta + rnorm(n)

### fit Mr.ASH
fit.mr.ash  = mr.ash(X,y, method_q = "sigma_indep_q")
fit.mr.ash  = mr.ash(X,y, method_q = "sigma_scaled_beta")
fit.mr.ash  = mr.ash(X,y, method_q = "sigma_dep_q")

### prediction routine
Xnew        = matrix(rnorm(n*p),n,p)
ynew        = Xnew %*% beta + rnorm(n)
ypred       = predict(fit.mr.ash, Xnew)

### test error
rmse        = norm(ynew - ypred, '2') / sqrt(n)

### coefficients
betahat     = predict(fit.mr.ash, type = "coefficients")
# this equals c(fit.mr.ash$intercept, fit.mr.ash$beta)

stephenslab/mr.ash.alpha documentation built on Oct. 31, 2023, 4:21 p.m.