Description
This function implements the Dirichlet-Laplace model of Bhattacharya et al. (2015) for obtaining a sparse estimate of θ = (θ_1, ..., θ_n) in the normal means problem,
X_i = θ_i + ε_i,
where ε_i \sim N(0, σ^2). This is achieved by placing the Dirichlet-Laplace (DL) prior on the individual θ_i's. The sparsity parameter τ in the Dir(τ, …, τ) prior can be specified a priori, or it can be estimated from the data by: 1) using the estimate of the sparsity level from van der Pas et al. (2014), 2) taking a restricted marginal maximum likelihood (REML) estimate on [1/n, 1], 3) endowing τ with a uniform prior, U(1/n, 1), or 4) endowing τ with a standard Cauchy prior truncated to [1/n, 1]. Multiple testing can also be performed, either by thresholding the shrinkage factor in the posterior mean, or by examining the marginal 95 percent credible intervals.
Usage

dl.normalmeans(x, tau.est=c("fixed", "est.sparsity", "reml", "uniform",
               "truncatedCauchy"), tau=1/length(x), sigma2=1,
               var.select=c("threshold", "intervals"),
               max.steps=10000, burnin=5000)
Arguments

x
    An n \times 1 multivariate normal vector.

tau.est
    The method for estimating the sparsity parameter τ. If "fixed" is chosen, τ is not estimated and is set to the value supplied in the tau argument.

tau
    The concentration parameter τ in the Dir(τ, …, τ) prior. Controls the sparsity of the model. Defaults to 1/n, but the user may specify a different value for τ (τ > 0).

sigma2
    The variance parameter (σ^2 > 0). Defaults to 1. The user needs to specify the noise parameter if it is different from 1.

var.select
    The method of variable selection: "threshold" thresholds the shrinkage factor in the posterior mean, while "intervals" examines the marginal 95 percent credible intervals.

max.steps
    The total number of iterations to run in the Gibbs sampler. Defaults to 10,000.

burnin
    The number of burn-in iterations for the Gibbs sampler. Defaults to 5,000.
Details

The function implements sparse estimation and multiple hypothesis testing on a multivariate normal mean vector θ = (θ_1, …, θ_n) with the Dirichlet-Laplace prior of Bhattacharya et al. (2015). The full model is:
X | θ \sim N_n( θ, σ^2 I_n),
θ_i | (ψ_i, φ_i, ω) \sim N(0, σ^2 ψ_i φ_i^2 ω^2), i = 1, …, n,
ψ_i \sim Exp(1/2), i = 1, …, n,
(φ_1, …, φ_n) \sim Dir(τ, …, τ),
ω \sim G(n τ, 1/2).
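As an illustration of the hierarchy above, one can forward-simulate θ from the DL prior. The sketch below is not the package's sampler; it assumes rate parameterizations for the Exponential and Gamma distributions, σ^2 = 1, and uses the standard normalized-Gamma construction for the Dirichlet draw:

```r
set.seed(123)
n   <- 100
tau <- 1 / n

psi   <- rexp(n, rate = 1/2)                        # psi_i ~ Exp(1/2)
g     <- rgamma(n, shape = tau, rate = 1)
phi   <- g / sum(g)                                 # (phi_1, ..., phi_n) ~ Dir(tau, ..., tau)
omega <- rgamma(1, shape = n * tau, rate = 1/2)     # omega ~ G(n*tau, 1/2)

# theta_i | psi_i, phi_i, omega ~ N(0, psi_i * phi_i^2 * omega^2)
theta <- rnorm(n, mean = 0, sd = sqrt(psi) * phi * omega)
```

With τ = 1/n, the Dirichlet weights concentrate most of their mass on a few coordinates, so most simulated θ_i are near zero.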
τ is the main parameter that controls the sparsity of the solution. It can be estimated by: the empirical Bayes estimate of the sparsity level ("est.sparsity") given in van der Pas et al. (2014); restricted marginal maximum likelihood in the interval [1/n, 1] ("reml"); a uniform prior τ \sim U(1/n, 1) ("uniform"); or a standard Cauchy prior truncated to [1/n, 1] ("truncatedCauchy").
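For intuition, a common form of the van der Pas et al. (2014) empirical estimate of the sparsity level counts observations exceeding the universal threshold sqrt(2 σ^2 log n) and clips the proportion to [1/n, 1]. The helper below (est.sparsity is a hypothetical name) is a hedged sketch of this idea, not the package's internal implementation:

```r
# Sketch of an empirical-Bayes sparsity estimate (van der Pas et al., 2014 style):
# the fraction of |x_i| above the universal threshold, bounded below by 1/n.
est.sparsity <- function(x, sigma2 = 1) {
  n     <- length(x)
  count <- sum(abs(x) >= sqrt(2 * sigma2 * log(n)))
  max(count / n, 1 / n)
}

x.demo <- c(rep(0, 95), rep(10, 5))
est.sparsity(x.demo)   # 0.05: 5 of 100 entries exceed sqrt(2*log(100)) ~ 3.03
```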
The posterior mean is of the form [E(1-κ_i | X_1, …, X_n)] X_i, i = 1, …, n. The "threshold" method for variable selection classifies θ_i as a signal (θ_i \neq 0) if E(1 - κ_i | X_1, …, X_n) > 1/2.
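The thresholding rule can be sketched as follows. Here shrink.weights is assumed to be a (draws × n) matrix of posterior samples of the shrinkage factors 1 - κ_i; it is filled with placeholder Beta draws purely for illustration:

```r
set.seed(1)
draws <- 200
n     <- 10

# Placeholder posterior draws of (1 - kappa_i); in practice these come
# from the Gibbs sampler.
shrink.weights <- matrix(rbeta(draws * n, 2, 2), nrow = draws, ncol = n)

post.mean.weight <- colMeans(shrink.weights)     # E(1 - kappa_i | X_1, ..., X_n)
signal <- as.numeric(post.mean.weight > 0.5)     # 1 = classified as signal, 0 = noise
```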
Value

The function returns a list containing the following components:

theta.hat
    The posterior mean of θ.

theta.med
    The posterior median of θ.

theta.var
    The posterior variance estimates for each θ_i, i = 1, …, n.

theta.intervals
    The 95 percent credible intervals for all n components of θ.

dl.classifications
    An n-dimensional binary vector, with "1" if θ_i is selected as a signal and "0" if it is deemed irrelevant.

tau.estimate
    The estimate of the sparsity level. If the user specified a fixed value of τ, this is simply that fixed value.
Author(s)

Ray Bai and Malay Ghosh
References

Bhattacharya, A., Pati, D., Pillai, N. S., and Dunson, D. B. (2015). "Dirichlet-Laplace priors for optimal shrinkage." Journal of the American Statistical Association, 110(512):1479-1490.
van der Pas, S. L., Kleijn, B. J. K., and van der Vaart, A. W. (2014). "The horseshoe estimator: Posterior concentration around nearly black vectors." Electronic Journal of Statistics, 8(2):2585-2618.
van der Pas, S. L., Szabo, B. T., and van der Vaart, A. (2017). "Adaptive posterior contraction rates for the horseshoe." Electronic Journal of Statistics, 11(2):3196-3225.
Examples

###################################################
# Example on synthetic data.                      #
# 5 percent of entries in theta are set to A = 7. #
###################################################
n <- 100
sparsity.level <- 5
A <- 7
# Initialize theta vector of all zeros
theta.true <- rep(0,n)
# Set (sparsity.level) percent of them to be A
q <- floor(n*(sparsity.level/100))
# Pick random indices of theta.true to equal A
signal.indices <- sample(1:n, size=q, replace=FALSE)
###################
# Generate data X #
###################
theta.true[signal.indices] <- A
X <- theta.true + rnorm(n,0,1)
#########################
# Run the DL model on X #
#########################
# For optimal performance, set max.steps=10000 and burnin=5000.
# Estimate tau from the empirical Bayes estimate of sparsity level
dl.model <- dl.normalmeans(X, tau.est="est.sparsity", sigma2=1, var.select="threshold",
max.steps=1000, burnin=500)
dl.model$theta.med # Posterior median estimates
dl.model$dl.classifications # Classifications
dl.model$tau.estimate # Estimate of sparsity level