ebreg: Implements the empirical Bayes method in high-dimensional...

View source: R/ebproject.R

Description

The function ebreg implements the method first presented in Martin, Mess, and Walker (2017) for Bayesian inference and variable selection in the high-dimensional sparse linear regression problem. The chief novelty is the manner in which the prior distribution for the regression coefficients depends on data; more details, with a focus on the prediction problem, are given in Martin and Tang (2019).

Usage

ebreg(
  y,
  X,
  XX,
  standardized = TRUE,
  alpha = 0.99,
  gam = 0.005,
  sig2,
  prior = TRUE,
  igpar = c(0.01, 4),
  log.f,
  M,
  sample.beta = FALSE,
  pred = FALSE,
  conf.level = 0.95
)

Arguments

y

vector of response variables for regression

X

matrix of predictor variables

XX

vector of new predictor values at which the outcome variable is predicted; used if pred=TRUE

standardized

logical. If TRUE, the data provided has already been standardized

alpha

numeric value between 0 and 1, likelihood fraction. Default is 0.99

gam

numeric value between 0 and 1, conditional prior precision parameter. Default is 0.005

sig2

numeric value for error variance. If NULL (default), variance is estimated from data

prior

logical. If TRUE, a prior is used for the error variance

igpar

the parameters for the inverse gamma prior on the error variance. Default is (0.01,4)

log.f

log of the prior for the model size

M

integer value giving the Monte Carlo sample size (a burn-in of size 0.2 * M is added automatically)

sample.beta

logical. If TRUE, samples of beta are obtained

pred

logical. If TRUE, predictions are obtained

conf.level

numeric value between 0 and 1, confidence level for the marginal credible interval if sample.beta=TRUE, and for the prediction interval if pred=TRUE
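The standardized argument tells ebreg whether the inputs have already been standardized. This page does not spell out which convention is meant, so the following is only a sketch of one common choice (centered response, predictor columns centered and scaled to unit sample standard deviation) using base R. The names X.std, y.ctr, and X.new.std are illustrative and not part of the package.

```r
set.seed(1)
n <- 70
p <- 100
X <- matrix(rnorm(n * p, mean = 2, sd = 3), nrow = n, ncol = p)
X.new <- matrix(rnorm(p, mean = 2, sd = 3), nrow = 1, ncol = p)
y <- rnorm(n, mean = 5)

# Center each column of X and scale it to unit sample standard deviation
# (assumed convention; check the package source for the one ebreg expects)
X.std <- scale(X, center = TRUE, scale = TRUE)

# Center the response
y.ctr <- y - mean(y)

# Apply the same centering and scaling to the new predictor row
X.new.std <- scale(X.new,
                   center = attr(X.std, "scaled:center"),
                   scale = attr(X.std, "scaled:scale"))

# Sanity checks: column means near 0, column standard deviations near 1
max(abs(colMeans(X.std)))
range(apply(X.std, 2, sd))
```

Reusing the training centers and scales for X.new keeps the new row on the same scale as the columns of X.std.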

Details

Consider the classical regression problem

y = Xβ + σ ε,

where y is an n-vector of responses, X is an n × p matrix of predictor variables, β is a p-vector of regression coefficients, σ > 0 is a scale parameter, and ε is an n-vector of independent and identically distributed standard normal errors. Here we allow p ≥ n (or even p >> n) and accommodate the high dimensionality by assuming β is sparse, in the sense that most of its components are zero. The approach described in Martin, Mess, and Walker (2017) and in Martin and Tang (2019) starts by decomposing the full β vector as a pair (S, β_S), where S is a subset of the indices 1, 2, …, p that marks the locations of the active variables and β_S is the |S|-vector of non-zero coefficients. It then specifies a prior distribution for S and a conditional prior distribution for β_S, given S. The latter is taken to depend on the data, hence "empirical". A prior distribution for σ^2 can also be introduced; this option is available through the prior and igpar arguments.
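In symbols, the decomposition and the two-stage prior can be sketched as follows. The notation π(S), π(β_S | S), and X_S (the columns of X indexed by S) is introduced here for illustration only; the specific forms of the two priors are given in the cited papers and are not reproduced here.

```latex
\begin{align*}
y &= X\beta + \sigma\varepsilon, \qquad \varepsilon \sim \mathsf{N}_n(0, I_n), \\
\beta &\longleftrightarrow (S, \beta_S), \qquad S \subseteq \{1, 2, \ldots, p\}, \quad \beta_S \in \mathbb{R}^{|S|}, \\
\pi(S, \beta_S) &= \pi(S)\, \pi(\beta_S \mid S), \\
y \mid (S, \beta_S, \sigma^2) &\sim \mathsf{N}_n(X_S \beta_S, \, \sigma^2 I_n).
\end{align*}
```

The last line uses the fact that the components of β outside S are zero, so Xβ = X_S β_S.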

Value

A list with components

Author(s)

Yiqi Tang

Ryan Martin

References

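Martin, R., Mess, R., and Walker, S. G. (2017). Empirical Bayes posterior concentration in sparse high-dimensional linear models. Bernoulli, 23(3), 1822-1847.

Martin, R. and Tang, Y. (2019). Empirical priors for prediction in sparse high-dimensional linear regression. arXiv preprint.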

Examples

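library(ebreg)

# Simulate a sparse regression: n observations, p predictors,
# with only the first s0 coefficients non-zero
n <- 70
p <- 100
beta <- rep(1, 5)
s0 <- length(beta)
sig2 <- 1
d <- 1
# Log prior for the model size x; log(x <= n) is -Inf for sizes above n
log.f <- function(x) -x * (log(1) + 0.05 * log(p)) + log(x <= n)
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X.new <- matrix(rnorm(p), nrow = 1, ncol = p)
y <- as.numeric(X[, 1:s0] %*% beta[1:s0]) + sqrt(sig2) * rnorm(n)

# Fit with sig2 = NULL, no prior on the error variance, and beta sampled;
# pred = FALSE, so X.new is passed but no predictions are requested
o <- ebreg(y, X, X.new, standardized = TRUE, alpha = 0.99, gam = 0.005,
           sig2 = NULL, prior = FALSE, igpar = c(0.01, 4), log.f = log.f,
           M = 5000, sample.beta = TRUE, pred = FALSE, conf.level = 0.95)

# Plot the inclusion probability of each variable
incl.pr <- o$incl.prob
plot(incl.pr, xlab = "Variable Index", ylab = "Inclusion Probability",
     type = "h", ylim = c(0, 1))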
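
The log.f function in the example is the log of a prior mass function on the model size, up to a normalizing constant. As a quick check in base R, and assuming the support is the sizes 0, 1, ..., p (an assumption, not something this page states), the sketch below normalizes exp(log.f) to see what the prior looks like. The names sizes, w, and prior.size are illustrative and not part of the package.

```r
n <- 70
p <- 100
log.f <- function(x) -x * (log(1) + 0.05 * log(p)) + log(x <= n)

# Candidate model sizes (assumed support, for illustration only)
sizes <- 0:p

# Unnormalized prior weights; log(FALSE) is -Inf, so sizes above n get weight 0
w <- exp(log.f(sizes))
prior.size <- w / sum(w)

# Total prior mass on sizes larger than n
sum(prior.size[sizes > n])

# Ratio of successive weights for sizes up to n: constant, equal to p^(-0.05)
prior.size[2] / prior.size[1]

# Size with the largest prior mass
sizes[which.max(prior.size)]
```

Because log(1) is zero, the weights decay geometrically in the model size with ratio p^(-0.05), so smaller models receive more prior mass, and the indicator term rules out models with more than n variables.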

ebreg documentation built on May 26, 2021, 5:07 p.m.