general_sequence_model: Compute marginal posterior estimates

View source: R/SequenceSpikeSlab.R

general_sequence_modelR Documentation

Compute marginal posterior estimates

Description

This function computes marginal posterior probabilities (slab probabilities) that data points have non-zero mean for the general hierarchical prior in the sparse normal sequence model. The posterior mean is also provided.

Usage

general_sequence_model(
  x,
  sigma = 1,
  slab = "Laplace",
  log_prior = "beta-binomial",
  Laplace_lambda = 0.5,
  Cauchy_gamma = 1,
  beta_kappa = 1,
  beta_lambda,
  show_progress = TRUE
)

Arguments

x

Vector of n data points

sigma

Standard deviation of the Gaussian noise in the data. May also be set to "auto", in which case sigma is estimated using the estimateSigma function from the selectiveInference package

slab

Slab distribution. Must be either "Laplace" or "Cauchy".

log_prior

Vector of length n+1 containing the logarithms of the prior probabilities pi_n(s) that the number of spikes is equal to s for s=0,...,n. It is allowed to use an unnormalized prior that does not sum to 1, because adding any constant to the log-prior probabilities does not change the result. Instead of a vector, log_prior may also be set to "beta-binomial" as a short-hand for log_prior = lbeta(beta_kappa+(0:n),beta_lambda+n-(0:n)) - lbeta(beta_kappa,beta_lambda) + lchoose(n,0:n).

Laplace_lambda

Parameter of the Laplace slab

Cauchy_gamma

Parameter of the Cauchy slab

beta_kappa

Parameter of the beta-distribution in the beta-binomial prior

beta_lambda

Parameter of the beta-distribution in the beta-binomial prior. Default value=n+1

show_progress

Boolean that indicates whether to show a progress bar

Details

The run-time is O(n^2) on n data points, which means that doubling the size of the data leads to an increase in computation time by approximately a factor of 4. Data sets of size n=25,000 should be feasible within approximately 30 minutes.

Value

list (postprobs, postmean, sigma), where postprobs is a vector of marginal posterior slab probabilities that x[i] has non-zero mean for i=1,...,n; postmean is a vector with the posterior mean for the x[i]; and sigma is the value of sigma (this may be of interest when the sigma="auto" option is used)

Examples

# Experiments similar to those of Castilo, Van der Vaart, 2012

# Generate data
n <- 500           # sample size
n_signal <- 25     # number of non-zero theta
A <- 5             # signal strength
theta <- c(rep(A,n_signal), rep(0,n-n_signal))
x <- theta + rnorm(n, sd=1)

# Choose slab
slab <- "Laplace"
Laplace_lambda <- 0.5

# Prior 1
kappa1 <- 0.4   # hyperparameter
logprior1 <- c(0,-kappa1*(1:n)*log(n*3/(1:n)))
res1 <- general_sequence_model(x, sigma=1,
                               slab=slab,
                               log_prior=logprior1,
                               Laplace_lambda=Laplace_lambda)
print("Prior 1: Elements with marginal posterior probability >= 0.5:")
print(which(res1$postprobs >= 0.5))

# Prior 2
kappa2 <- 0.8   # hyperparameter
logprior2 <- kappa2*lchoose(2*n-0:n,n)
res2 <- general_sequence_model(x, sigma=1,
                               slab=slab,
                               log_prior=logprior2,
                               Laplace_lambda=Laplace_lambda)
print("Prior 2: Elements with marginal posterior probability >= 0.5:")
print(which(res2$postprobs >= 0.5))

# Prior 3
beta_kappa <- 1      # hyperparameter
beta_lambda <- n+1   # hyperparameter
res3 <- general_sequence_model(x, sigma=1,
                               slab=slab,
                               log_prior="beta-binomial",
                               Laplace_lambda=Laplace_lambda)
print("Prior 3: Elements with marginal posterior probability >= 0.5:")
print(which(res3$postprobs >= 0.5))

# Plot means for all priors
M=max(abs(x))+1
plot(1:n, x, pch=20, ylim=c(-M,M), col='green', xlab="", ylab="", main="Posterior Means")
points(1:n, theta, pch=20, col='blue')
points(1:n, res1$postmean, pch=20, col='black', cex=0.6)
points(1:n, res2$postmean, pch=20, col='magenta', cex=0.6)
points(1:n, res3$postmean, pch=20, col='red', cex=0.6)
legend("topright", legend=c("posterior mean 1", "posterior mean 2", "posterior mean 3",
                            "data", "truth"),
       col=c("black", "magenta", "red", "green", "blue"), pch=20, cex=0.7)

SequenceSpikeSlab documentation built on Sept. 8, 2023, 6:06 p.m.