mr.ash | R Documentation |
Model fitting algorithms for Multiple Regression with Adaptive Shrinkage ("Mr.ASH"). Mr.ASH is a variational empirical Bayes (VEB) method for multiple linear regression. The fitting algorithms (locally) maximize the approximate marginal likelihood (the "evidence lower bound", or ELBO) via coordinate-wise updates.
mr.ash(
X,
y,
Z = NULL,
sa2 = NULL,
method_q = c("sigma_dep_q", "sigma_indep_q"),
method_g = c("caisa", "accelerate", "block"),
max.iter = 1000,
min.iter = 1,
beta.init = NULL,
update.pi = TRUE,
pi = NULL,
update.sigma2 = TRUE,
sigma2 = NULL,
update.order = NULL,
standardize = FALSE,
intercept = TRUE,
tol = set_default_tolerance()
)
X |
The input matrix, of dimension (n,p); each column is a single predictor; and each row is an observation vector. Here, n is the number of samples and p is the number of predictors. The matrix cannot be sparse. |
y |
The observed continuously-valued responses, a vector of length p. |
Z |
The covariate matrix, of dimension (n,k), where k is the
number of covariates. This feature is not yet implemented;
|
sa2 |
The vector of prior mixture component variances. The
variances should be in increasing order, starting at zero; that is,
|
method_q |
The algorithm used to update the variational
approximation to the posterior distribution of the regression
coefficients, |
method_g |
|
max.iter |
The maximum number of outer loop iterations allowed. |
min.iter |
The minimum number of outer loop iterations allowed. |
beta.init |
The initial estimate of the (approximate)
posterior mean regression coefficients. This should be |
update.pi |
If |
pi |
The initial estimate of the mixture proportions
|
, where
K = length(sa2).
update.sigma2 |
If |
sigma2 |
The initial estimate of the residual variance,
|
update.order |
The order in which the co-ordinate ascent
updates for estimating the posterior mean coefficients are
performed. |
standardize |
The logical flag for standardization of the columns of X variable, prior to the model fitting. The coefficients are always returned on the original scale. |
intercept |
When |
tol |
Additional settings controlling behaviour of the model
fitting algorithm. |
Mr.ASH is a statistical inference method for the following multiple linear regression model:
y | X, \beta, \sigma^2 ~
N(X \beta, \sigma I_n),
in which the regression coefficients
\beta
admit a mixture-of-normals prior,
\beta | \pi,
\sigma ~ g = \sum_{k=1}^K N(0, \sigma^2 \sigma_k^2).
Each mixture
component in the prior, g
, is a normal density centered at
zero, with variance \sigma^2 \sigma_k^2
.
The fitting algorithm, it run for a large enough number of
iterations, will find an approximate posterior for the regression
coefficients, denoted by q(\beta)
, residual variance
parameter sigma^2
, and prior mixture weights \pi_1,
\ldots, \pi_K
maximizing the evidence lower bound,
F(q, \pi,
\sigma^2) = E_q \log p(y | X, \beta, \sigma^2) - \sum_{j=1}^p
D_{KL}(q_j || g),
where D_{KL}(q_j || g)
denotes the
Kullback-Leibler (KL) divergence, a measure of the "distance"
between (approximate) posterior q_j(\beta_j)
and prior
g(\beta_j)
. The fitting algorithm iteratively updates the
approximate posteriors q_1, \ldots, q_p
, separately for each
j = 1, \ldots, p
(in an order determined by
update.order
), then separately updates the mixture weights
and \pi
and residual variance \sigma^2
. This
coordinate-wise update scheme iterates until the convergence
criterion is met, or until the algorithm hits an upper bound on
the number of iterations (specified by max.iter
).
See ‘References’ for more details about the model and algorithm.
A list object with the following elements:
intercept |
The estimated intercept. |
beta |
A vector containing posterior mean estimates of the regression coefficients for all predictors. |
sigma2 |
The estimated residual variance. |
pi |
A vector of containing the estimated mixture proportions. |
iter |
The number of outer-loop iterations that were performed. |
update.order |
The ordering used for performing the
coordinate-wise updates. For |
varobj |
A vector of length |
data |
The preprocessed data (X, Z, y) provided as input to the model
fitting algorithm. |
Y. Kim (2020), Bayesian shrinkage methods for high dimensional regression. Ph.D. thesis, University of Chicago.
get.full.posterior
, predict.mr.ash
### generate synthetic data
set.seed(1)
n = 200
p = 300
X = matrix(rnorm(n*p),n,p)
beta = double(p)
beta[1:10] = 1:10
y = X %*% beta + rnorm(n)
### fit Mr.ASH
fit.mr.ash = mr.ash(X,y, method_q = "sigma_indep_q")
fit.mr.ash = mr.ash(X,y, method_q = "sigma_scaled_beta")
fit.mr.ash = mr.ash(X,y, method_q = "sigma_dep_q")
### prediction routine
Xnew = matrix(rnorm(n*p),n,p)
ynew = Xnew %*% beta + rnorm(n)
ypred = predict(fit.mr.ash, Xnew)
### test error
rmse = norm(ynew - ypred, '2') / sqrt(n)
### coefficients
betahat = predict(fit.mr.ash, type = "coefficients")
# this equals c(fit.mr.ash$intercept, fit.mr.ash$beta)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.