mr.ash: Multiple Regression with Adaptive Shrinkage

View source: R/mr.ash.R

mr.ashR Documentation

Multiple Regression with Adaptive Shrinkage

Description

Model fitting algorithms for Multiple Regression with Adaptive Shrinkage ("Mr.ASH"). Mr.ASH is a variational empirical Bayes (VEB) method for multiple linear regression. The fitting algorithms (locally) maximize the approximate marginal likelihood (the "evidence lower bound", or ELBO) via coordinate-wise updates.

Usage

mr.ash(
  X,
  y,
  Z = NULL,
  sa2 = NULL,
  method_q = c("sigma_dep_q", "sigma_indep_q"),
  method_g = c("caisa", "accelerate", "block"),
  max.iter = 1000,
  min.iter = 1,
  beta.init = NULL,
  update.pi = TRUE,
  pi = NULL,
  update.sigma2 = TRUE,
  sigma2 = NULL,
  update.order = NULL,
  standardize = FALSE,
  intercept = TRUE,
  tol = set_default_tolerance()
)

Arguments

X

The input matrix, of dimension (n,p); each column is a single predictor; and each row is an observation vector. Here, n is the number of samples and p is the number of predictors. The matrix cannot be sparse.

y

The observed continuously-valued responses, a vector of length p.

Z

The covariate matrix, of dimension (n,k), where k is the number of covariates. This feature is not yet implemented; Z must be set to NULL.

sa2

The vector of prior mixture component variances. The variances should be in increasing order, starting at zero; that is, sort(sa2) should be the same as sa2. When sa2 is NULL, the default setting is used, sa2[k] = (2^(0.05*(k-1)) - 1)^2, for k = 1:20. For this default setting, sa2[1] = 0, and sa2[20] is roughly 1.

method_q

The algorithm used to update the variational approximation to the posterior distribution of the regression coefficients, method = "sigma_dep_q", method = "sigma_indep_q" and "sigma_scaled_beta", take different approaches to updating the residual variance sigma^2.

method_g

method = "caisa", an abbreviation of "Cooridinate Ascent Iterative Shinkage Algorithm", fits the model by approximate EM; it iteratively updates the variational approximation to the posterior distribution of the regression coefficients (the approximate E-step) and the model parameters (mixture weights and residual covariance) in an approximate M-step. Settings method = "block" and method = "accelerate" are considered experimental.

max.iter

The maximum number of outer loop iterations allowed.

min.iter

The minimum number of outer loop iterations allowed.

beta.init

The initial estimate of the (approximate) posterior mean regression coefficients. This should be NULL, or a vector of length p. When beta.init is NULL, the posterior mean coefficients are all initially set to zero.

update.pi

If update.pi = TRUE, the mixture proportions in the mixture-of-normals prior are estimated from the data. In the manuscript, update.pi = TRUE.

pi

The initial estimate of the mixture proportions \pi_1, \ldots, \pi_K. If pi is NULL, the mixture weights are initialized to rep(1/K,K)

, where K = length(sa2).

update.sigma2

If update.sigma2 = TRUE, the residual variance sigma^2 is estimated from the data. In the manuscript, update.sigma = TRUE.

sigma2

The initial estimate of the residual variance, \sigma^2. If sigma2 = NULL, the residual variance is initialized to the empirical variance of the residuals based on the initial estimates of the regression coefficients, beta.init, after removing linear effects of the intercept and any covariances.

update.order

The order in which the co-ordinate ascent updates for estimating the posterior mean coefficients are performed. update.order can be NULL, "random", or any permutation of (1,\ldots,p), where p is the number of columns in the input matrix X. When update.order is NULL, the co-ordinate ascent updates are performed in order in which they appear in X; this is equivalent to setting update.order = 1:p. When update.order = "random", the co-ordinate ascent updates are performed in a randomly generated order, and this random ordering is different at each outer-loop iteration.

standardize

The logical flag for standardization of the columns of X variable, prior to the model fitting. The coefficients are always returned on the original scale.

intercept

When intercept = TRUE, an intercept is included in the regression model.

tol

Additional settings controlling behaviour of the model fitting algorithm. tol$convtol controls the termination criterion for the model fitting. When update.pi = TRUE, the outer-loop updates stop when the largest change in the mixture weights is less than convtol*K; when update.pi = FALSE, the outer-loop updates stop when the largest change in the estimates of the posterior mean coefficients is less than convtol*K. tol$epstol is a small, positive number added to the likelihoods to avoid logarithms of zero.

Details

Mr.ASH is a statistical inference method for the following multiple linear regression model:

y | X, \beta, \sigma^2 ~ N(X \beta, \sigma I_n),

in which the regression coefficients \beta admit a mixture-of-normals prior,

\beta | \pi, \sigma ~ g = \sum_{k=1}^K N(0, \sigma^2 \sigma_k^2).

Each mixture component in the prior, g, is a normal density centered at zero, with variance \sigma^2 \sigma_k^2.

The fitting algorithm, it run for a large enough number of iterations, will find an approximate posterior for the regression coefficients, denoted by q(\beta), residual variance parameter sigma^2, and prior mixture weights \pi_1, \ldots, \pi_K maximizing the evidence lower bound,

F(q, \pi, \sigma^2) = E_q \log p(y | X, \beta, \sigma^2) - \sum_{j=1}^p D_{KL}(q_j || g),

where D_{KL}(q_j || g) denotes the Kullback-Leibler (KL) divergence, a measure of the "distance" between (approximate) posterior q_j(\beta_j) and prior g(\beta_j). The fitting algorithm iteratively updates the approximate posteriors q_1, \ldots, q_p, separately for each j = 1, \ldots, p (in an order determined by update.order), then separately updates the mixture weights and \pi and residual variance \sigma^2. This coordinate-wise update scheme iterates until the convergence criterion is met, or until the algorithm hits an upper bound on the number of iterations (specified by max.iter).

See ‘References’ for more details about the model and algorithm.

Value

A list object with the following elements:

intercept

The estimated intercept.

beta

A vector containing posterior mean estimates of the regression coefficients for all predictors.

sigma2

The estimated residual variance.

pi

A vector of containing the estimated mixture proportions.

iter

The number of outer-loop iterations that were performed.

update.order

The ordering used for performing the coordinate-wise updates. For update.order = "random", the orderings for outer-loop iterations are provided in a vector of length p*max.iter, where p is the number of predictors.

varobj

A vector of length numiter, containing the value of the variational objective (equal to the negative "evidence lower bound") attained at each (outer-loop) model fitting iteration. Note that the objective does not account for the intercept term, even when intercept = TRUE; therefore, this value shoudl be interpreted as being an approximation to the marginal likelihood conditional on the estimate of the intercept.

data

The preprocessed data (X, Z, y) provided as input to the model fitting algorithm. data$w is equal to diag(crossprod(X)), in which X is the preprocessed data matrix. Additionally, data$sa2 gives the prior variances used.

References

Y. Kim (2020), Bayesian shrinkage methods for high dimensional regression. Ph.D. thesis, University of Chicago.

See Also

get.full.posterior, predict.mr.ash

Examples

### generate synthetic data
set.seed(1)
n           = 200
p           = 300
X           = matrix(rnorm(n*p),n,p)
beta        = double(p)
beta[1:10]  = 1:10
y           = X %*% beta + rnorm(n)

### fit Mr.ASH
fit.mr.ash  = mr.ash(X,y, method_q = "sigma_indep_q")
fit.mr.ash  = mr.ash(X,y, method_q = "sigma_scaled_beta")
fit.mr.ash  = mr.ash(X,y, method_q = "sigma_dep_q")

### prediction routine
Xnew        = matrix(rnorm(n*p),n,p)
ynew        = Xnew %*% beta + rnorm(n)
ypred       = predict(fit.mr.ash, Xnew)

### test error
rmse        = norm(ynew - ypred, '2') / sqrt(n)

### coefficients
betahat     = predict(fit.mr.ash, type = "coefficients")
# this equals c(fit.mr.ash$intercept, fit.mr.ash$beta)


stephenslab/mr.ash.alpha documentation built on Oct. 31, 2023, 4:21 p.m.