hsplus.normalmeans: Normal Means Estimation and Hypothesis Testing with the...


Description

This function implements the horseshoe+ model of Bhadra et al. (2017) for obtaining a sparse estimate of θ = (θ_1, ..., θ_n) in the normal means problem,

X_i = θ_i + ε_i,

where ε_i \sim N(0, σ^2). This is achieved by placing the horseshoe+ (HS+) prior on the individual θ_i's. The sparsity parameter τ can be specified a priori, or it can be estimated from the data by: 1) using the empirical Bayes estimate of the sparsity level from van der Pas et al. (2014), 2) taking a restricted marginal maximum likelihood (REML) estimate on [1/n, 1], 3) endowing τ with a uniform prior U(1/n, 1), or 4) endowing τ with a standard Cauchy prior truncated to [1/n, 1]. Multiple testing can also be performed, either by thresholding the shrinkage factor in the posterior mean or by examining the marginal 95 percent credible intervals.

Usage

hsplus.normalmeans(x, tau.est=c("fixed", "est.sparsity", "reml", "uniform", 
                   "truncatedCauchy"), tau=1/length(x), sigma2=1, 
                   var.select=c("threshold", "intervals"), 
                   max.steps = 10000, burnin=5000)

Arguments

x

an n \times 1 multivariate normal vector.

tau.est

The method for estimating the sparsity parameter τ. If "fixed" is chosen, then τ is not estimated from the data. If "est.sparsity" is chosen, the empirical Bayes estimate of sparsity level from van der Pas et al. (2014) is used. If "reml" is chosen, τ is estimated from restricted marginal maximum likelihood on [1/n, 1]. If "uniform" is chosen, τ is estimated with a uniform prior on [1/n, 1]. If "truncatedCauchy" is chosen, τ is estimated with a standard Cauchy prior truncated to [1/n, 1].

tau

The global parameter τ in the HS+ prior, which controls the sparsity of the model. Defaults to 1/n. The user may specify a different value (τ > 0). This argument is ignored if the method "est.sparsity", "reml", "uniform", or "truncatedCauchy" is used to estimate τ.

sigma2

The variance parameter σ^2. Defaults to 1. The user should specify a different value if the noise variance differs from 1 (σ^2 > 0).

var.select

The method of variable selection. "threshold" selects variables by thresholding the shrinkage factor in the posterior mean. "intervals" will classify θ_i's as either signals (θ_i \neq 0) or as noise (θ_i = 0) by examining the 95 percent posterior credible intervals.

max.steps

The total number of iterations to run in the Gibbs sampler. Defaults to 10,000.

burnin

The number of burn-in iterations for the Gibbs sampler. Defaults to 5,000.

Details

The function implements sparse estimation and multiple hypothesis testing on a multivariate normal mean vector, θ = (θ_1, …, θ_n), with the horseshoe+ prior of Bhadra et al. (2017). The full model is:

X | θ \sim N_n( θ, σ^2 I_n),

θ_i | λ_i \sim N(0, σ^2 λ_i^2), i = 1, …, n,

λ_i \sim C^{+}(0, τ η_i), i = 1, …, n,

η_i \sim C^{+}(0, 1), i = 1, …, n.
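
As an illustration of this hierarchy, the following sketch (not part of the package) draws one sample of θ from the HS+ prior; folded rcauchy draws stand in for the half-Cauchy C^{+} distributions, and the values of n, τ, and σ^2 are arbitrary.

set.seed(1)
n <- 100; tau <- 1/n; sigma2 <- 1
eta    <- abs(rcauchy(n))                # eta_i ~ C+(0, 1)
lambda <- tau * eta * abs(rcauchy(n))    # lambda_i ~ C+(0, tau * eta_i)
theta  <- rnorm(n, mean = 0, sd = sqrt(sigma2) * lambda)  # theta_i ~ N(0, sigma^2 * lambda_i^2)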

τ is the global parameter that controls the sparsity of the solution. It can be estimated by: the empirical Bayes estimate of the sparsity level ("est.sparsity") given in van der Pas et al. (2014), restricted marginal maximum likelihood on the interval [1/n, 1] ("reml"), a uniform prior τ \sim U(1/n, 1) ("uniform"), or a standard Cauchy prior truncated to [1/n, 1] ("truncatedCauchy").
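
For intuition, one common plug-in version of the "est.sparsity" estimate counts observations exceeding the universal threshold sqrt(2 σ^2 log n) and floors the resulting proportion at 1/n. The sketch below shows that reading; the package's exact implementation may differ.

est.sparsity.sketch <- function(x, sigma2 = 1) {
  n <- length(x)
  # proportion of |x_i| above the universal threshold, kept at or above 1/n
  max(sum(abs(x) > sqrt(2 * sigma2 * log(n))), 1) / n
}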

The posterior mean is of the form [E(1-κ_i | X_1, …, X_n)] X_i, i = 1, …, n. The "threshold" method for variable selection is to classify θ_i as signal (θ_i \neq 0) if E(1 - κ_i | X_1, …, X_n) > 1/2.
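
In code, and assuming a hypothetical fitted object fit returned by hsplus.normalmeans(x, ...), the shrinkage weight can be recovered from the returned posterior mean and thresholded as follows:

shrink.weight <- fit$theta.hat / x          # E(1 - kappa_i | X), for nonzero x_i
signals <- as.integer(shrink.weight > 0.5)  # 1 = signal, 0 = noise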

Value

The function returns a list containing the following components:

theta.hat

The posterior mean of θ.

theta.med

The posterior median of θ.

theta.var

The posterior variance estimates for each θ_i, i = 1, …, n.

theta.intervals

The 95 percent credible intervals for all n components of θ.

hsplus.classifications

An n-dimensional binary vector with entry "1" if θ_i is classified as a signal and "0" if it is classified as noise.

tau.estimate

The estimate of the sparsity level. If the user specified "fixed" for tau.est, this simply returns the fixed τ. If the user specified "uniform" or "truncatedCauchy", this returns the posterior mean of π(τ | X_1, …, X_n).

Author(s)

Ray Bai and Malay Ghosh

References

Bhadra, A., Datta, J., Polson, N. G., and Willard, B. (2017). "The horseshoe+ estimator for ultra-sparse signals." Bayesian Analysis, 12(4):1105-1131.

Makalic, E., Schmidt, D. F., and Hopper, J. L. (2016). "Bayesian robust regression with the horseshoe+ estimator." AI 2016: Advances in Artificial Intelligence, 429-440.

van der Pas, S. L., Kleijn, B. J. K., and van der Vaart, A. W. (2014). "The horseshoe estimator: posterior concentration around nearly black vectors." Electronic Journal of Statistics, 8(2):2585-2618.

van der Pas, S. L., Szabo, B. T., and van der Vaart, A. (2017). "Adaptive posterior contraction rates for the horseshoe." Electronic Journal of Statistics, 11(2):3196-3225.

Examples

###################################################
# Example on synthetic data.                      # 
# 5 percent of entries in theta are set to A=7.   #
###################################################
n <- 100
sparsity.level <- 5
A <- 7
# Initialize theta vector of all zeros
theta.true <- rep(0,n)
# Set (sparsity.level) percent of them to be A
q <- floor(n*(sparsity.level/100))
# Pick random indices of theta.true to equal A
signal.indices <- sample(1:n, size=q, replace=FALSE)

###################
# Generate data X #
###################
theta.true[signal.indices] <- A
X <- theta.true + rnorm(n,0,1)

#################################
# Run the horseshoe+ model on X #
#################################
# For accurate inference, use the defaults max.steps=10000 and burnin=5000.
# A shorter chain is used below only to keep this example fast.

# Estimate tau using truncated Cauchy prior
hsplus.model <- hsplus.normalmeans(X, tau.est="truncatedCauchy", tau=1/length(X), 
                                  sigma2=1, var.select="threshold", 
                                  max.steps=1000, burnin=500)

hsplus.model$theta.intervals           # 95 percent credible intervals
hsplus.model$hsplus.classifications    # Classifications
hsplus.model$tau.estimate              # Estimate of sparsity level
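
# Since the simulated truth theta.true is known here, a quick follow-up
# check of estimation and selection accuracy (not part of the original
# example) might be:
mse <- mean((hsplus.model$theta.hat - theta.true)^2)   # estimation error
truth <- as.integer(theta.true != 0)                   # true signal indicators
table(truth = truth, selected = hsplus.model$hsplus.classifications)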
