atlasqtl | R Documentation |
Flexible sparse regression for hierarchically-related responses using annealed variational inference.
atlasqtl(
Y,
X,
p0,
anneal = c(1, 2, 10),
tol = 0.1,
maxit = 1000,
user_seed = NULL,
verbose = 1,
list_hyper = NULL,
list_init = NULL,
save_hyper = FALSE,
save_init = FALSE,
full_output = FALSE,
thinned_elbo_eval = TRUE,
checkpoint_path = NULL,
trace_path = NULL,
add_collinear_back = FALSE
)
Y |
Response data matrix of dimension n x q, where n is the number of samples and q is the number of response variables. |
X |
Input matrix of dimension n x p, where p is the number of candidate
predictors. |
p0 |
Vector of size 2 whose entries are the prior expectation and
variance of the number of predictors associated with each response.
Must be |
anneal |
Parameters for annealing scheme. Must be a vector whose first
entry is the type of schedule: 1 = geometric spacing (default),
2 = harmonic spacing or 3 = linear spacing, the second entry is the initial
temperature (default is 2), and the third entry is the temperature grid
size (default is 10). If |
tol |
Tolerance for the stopping criterion (default is 0.1). |
maxit |
Maximum number of iterations allowed (default is 1000). |
user_seed |
Seed set for reproducible default choices of hyperparameters
(if |
verbose |
Integer specifying the level of verbosity during execution: 0, for no message, 1 for standard verbosity (default), 2 for detailed verbosity. |
list_hyper |
An object of class " |
list_init |
An object of class " |
save_hyper |
If |
save_init |
If |
full_output |
If |
thinned_elbo_eval |
If |
checkpoint_path |
Path where to save temporary checkpoint outputs.
Default is |
trace_path |
Path where to save trace plot for the variance of hotspot
propensities. Default is |
add_collinear_back |
Boolean for re-adding the collinear variables X
(removed as part of |
atlasqtl
implements a flexible hierarchical regression
framework that allows information-sharing across responses and predictors,
thereby enhancing the detection of weak effects. atlasqtl
is
tailored to the detection of hotspot predictors, i.e., predictors
associated with many responses: it is based on a fully Bayesian model
whereby the hotspot propensity is modelled using the horseshoe shrinkage
prior: its global-local formulation shrinks noise globally and hence can
accommodate highly sparse settings, while being robust to individual
signals, thus leaving the effects of hotspots unshrunk. Inference is
carried out using a scalable variational algorithm coupled with a novel
simulated annealing procedure, which is applicable to large dimensions,
i.e., thousands of response variables, and allows efficient exploration of
multimodal distributions.
The columns of the response matrix Y
are centered within the
atlasqtl
call, and the columns of the candidate predictor matrix
X
are standardized.
An object of class "atlasqtl
" containing the following
variational estimates and settings:
beta_vb |
Estimated effect size matrix of dimension p x q. Entry (s, t) corresponds to the variational posterior mean (mu_beta_vb_st x gam_vb_st) of the regression effect between candidate predictor s and response t. |
gam_vb |
Posterior inclusion probability matrix of dimension p x q. Entry (s, t) corresponds to the variational posterior probability of association between candidate predictor s and response t. |
theta_vb |
Vector of length p containing the variational posterior mean of the hotspot propensity. Entry s corresponds to the propensity of candidate predictor s to be associated with many responses. |
X_beta_vb |
Matrix of dimension n x q equal to the matrix product X beta_vb, where X is the rescaled predictor matrix with possible collinear predictor(s) excluded. |
zeta_vb |
Vector of length q containing the variational posterior mean of the response importance. Entry t corresponds to the propensity of response t have associated predictors. |
converged |
A boolean indicating whether the algorithm has converged
before reaching |
it |
Final number of iterations. |
lb_opt |
Optimized variational lower bound (ELBO) on the marginal log-likelihood. |
diff_lb |
Difference in ELBO between the last and penultimate
iterations (to be used as a convergence diagnostic when
|
rmvd_cst_x |
Vectors containing the indices of constant variables in
|
rmvd_coll_x |
Vectors containing the indices of variables in |
list_hyper , list_init |
If |
... |
If |
Helene Ruffieux, Anthony C. Davison, Jorg Hager, Jamie Inshaw, Benjamin P. Fairfax, Sylvia Richardson, Leonardo Bottolo, A global-local approach for detecting hotspots in multiple-response regression, arXiv:1811.03334, 2018.
set_hyper
, set_init
.
seed <- 123; set.seed(seed)
###################
## Simulate data ##
###################
# Example with small problem sizes:
#
n <- 200; p <- 50; p_act <- 10; q <- 100; q_act <- 50
# Candidate predictors (subject to selection)
#
# Here example with common genetic variants under Hardy-Weinberg equilibrium
#
X_act <- matrix(rbinom(n * p_act, size = 2, p = 0.25), nrow = n)
X_inact <- matrix(rbinom(n * (p - p_act), size = 2, p = 0.25), nrow = n)
# shuffle indices
shuff_x_ind <- sample(p)
shuff_y_ind <- sample(q)
X <- cbind(X_act, X_inact)[, shuff_x_ind]
# Association pattern and effect sizes
#
pat <- matrix(FALSE, ncol = q, nrow = p)
bool_x <- shuff_x_ind <= p_act
bool_y <- shuff_y_ind <= q_act
pat_act <- beta_act <- matrix(0, nrow = p_act, ncol = q_act)
pat_act[sample(p_act * q_act, floor(p_act * q_act / 5))] <- 1
beta_act[as.logical(pat_act)] <- rnorm(sum(pat_act))
pat[bool_x, bool_y] <- pat_act
# Gaussian responses
#
Y_act <- matrix(rnorm(n * q_act, mean = X_act %*% beta_act), nrow = n)
Y_inact <- matrix(rnorm(n * (q - q_act)), nrow = n)
Y <- cbind(Y_act, Y_inact)[, shuff_y_ind]
########################
## Infer associations ##
########################
# Expectation and variance for the prior number of predictors associated with
# each response
#
p0 <- c(mean(colSums(pat)), 10)
res_atlas <- atlasqtl(Y = Y, X = X, p0 = p0, user_seed = seed)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.