atlasqtl: Flexible sparse regression for hierarchically-related...
In hruffieux/atlasqtl: Flexible sparse regression for hierarchically-related responses using annealed variational inference

atlasqtl

R Documentation

Flexible sparse regression for hierarchically-related responses using annealed variational inference.

Description

Flexible sparse regression for hierarchically-related responses using annealed variational inference.

Usage

atlasqtl(
  Y,
  X,
  p0,
  anneal = c(1, 2, 10),
  tol = 0.1,
  maxit = 1000,
  user_seed = NULL,
  verbose = 1,
  list_hyper = NULL,
  list_init = NULL,
  save_hyper = FALSE,
  save_init = FALSE,
  full_output = FALSE,
  thinned_elbo_eval = TRUE,
  checkpoint_path = NULL,
  trace_path = NULL,
  add_collinear_back = FALSE
)

Arguments

`Y`	Response data matrix of dimension n x q, where n is the number of samples and q is the number of response variables.
`X`	Input matrix of dimension n x p, where p is the number of candidate predictors. `X` cannot contain NAs. No intercept must be supplied.
`p0`	Vector of size 2 whose entries are the prior expectation and variance of the number of predictors associated with each response. Must be `NULL` if `list_init` and `list_hyper` are both non-`NULL`.
`anneal`	Parameters for annealing scheme. Must be a vector whose first entry is the type of schedule: 1 = geometric spacing (default), 2 = harmonic spacing or 3 = linear spacing, the second entry is the initial temperature (default is 2), and the third entry is the temperature grid size (default is 10). If `NULL`, no annealing is performed.
`tol`	Tolerance for the stopping criterion (default is 0.1).
`maxit`	Maximum number of iterations allowed (default is 1000).
`user_seed`	Seed set for reproducible default choices of hyperparameters (if `list_hyper` is `NULL`) and initial variational parameters (if `list_init` is `NULL`). Default is `NULL`, no seed set.
`verbose`	Integer specifying the level of verbosity during execution: 0, for no message, 1 for standard verbosity (default), 2 for detailed verbosity.
`list_hyper`	An object of class "`hyper`" containing the model hyperparameters. Must be specified using the `set_hyper` function or set to `NULL` for default hyperparameters.
`list_init`	An object of class "`init`" containing the initial variational parameters. Must be specified using the `set_init` function or set to `NULL` for a default initialization.
`save_hyper`	If `TRUE`, the hyperparameters used for the model are returned.
`save_init`	If `TRUE`, the initial variational parameters used for the inference are returned (note that the size of the resulting objects is likely to be large). Default is `FALSE`.
`full_output`	If `TRUE`, the inferred variational parameters for all parameters are returned.
`thinned_elbo_eval`	If `TRUE`, the convergence criterion (variational lower bound on the marginal log-likelihood, or ELBO) is not evaluated at each iteration, in order to save computational resources, but after a given batch of iterations where the batch size is set in an adaptive fashion accounting for the number of iterations roughly expected until convergence.
`checkpoint_path`	Path where to save temporary checkpoint outputs. Default is `NULL`, for no checkpointing.
`trace_path`	Path where to save trace plot for the variance of hotspot propensities. Default is `NULL`, for no trace saved.
`add_collinear_back`	Boolean for re-adding the collinear variables X (removed as part of `atlasqtl` to ensure the stability of inference) to the final posterior summaries. Default is `FALSE`.

Details

atlasqtl implements a flexible hierarchical regression framework that allows information-sharing across responses and predictors, thereby enhancing the detection of weak effects. atlasqtl is tailored to the detection of hotspot predictors, i.e., predictors associated with many responses: it is based on a fully Bayesian model whereby the hotspot propensity is modelled using the horseshoe shrinkage prior: its global-local formulation shrinks noise globally and hence can accommodate highly sparse settings, while being robust to individual signals, thus leaving the effects of hotspots unshrunk. Inference is carried out using a scalable variational algorithm coupled with a novel simulated annealing procedure, which is applicable to large dimensions, i.e., thousands of response variables, and allows efficient exploration of multimodal distributions.

The columns of the response matrix Y are centered within the atlasqtl call, and the columns of the candidate predictor matrix X are standardized.

Value

An object of class "atlasqtl" containing the following variational estimates and settings:

`beta_vb`	Estimated effect size matrix of dimension p x q. Entry (s, t) corresponds to the variational posterior mean (mu_beta_vb_st x gam_vb_st) of the regression effect between candidate predictor s and response t.
`gam_vb`	Posterior inclusion probability matrix of dimension p x q. Entry (s, t) corresponds to the variational posterior probability of association between candidate predictor s and response t.
`theta_vb`	Vector of length p containing the variational posterior mean of the hotspot propensity. Entry s corresponds to the propensity of candidate predictor s to be associated with many responses.
`X_beta_vb`	Matrix of dimension n x q equal to the matrix product X beta_vb, where X is the rescaled predictor matrix with possible collinear predictor(s) excluded.
`zeta_vb`	Vector of length q containing the variational posterior mean of the response importance. Entry t corresponds to the propensity of response t have associated predictors.
`converged`	A boolean indicating whether the algorithm has converged before reaching `maxit` iterations.
`it`	Final number of iterations.
`lb_opt`	Optimized variational lower bound (ELBO) on the marginal log-likelihood.
`diff_lb`	Difference in ELBO between the last and penultimate iterations (to be used as a convergence diagnostic when `maxit` has been reached).
`rmvd_cst_x`	Vectors containing the indices of constant variables in `X` removed prior to the analysis.
`rmvd_coll_x`	Vectors containing the indices of variables in `X` removed prior to the analysis because collinear to other variables. The entry name indicates the corresponding variable kept in the analysis.
`list_hyper`, `list_init`	If `save_hyper`, resp. `save_init`, are `TRUE`, the hyperparameters, resp. initial variational parameters, used for inference are saved as output.
`...`	If `full_output` is `TRUE` all inferred variational parameters are returned.

References

Helene Ruffieux, Anthony C. Davison, Jorg Hager, Jamie Inshaw, Benjamin P. Fairfax, Sylvia Richardson, Leonardo Bottolo, A global-local approach for detecting hotspots in multiple-response regression, arXiv:1811.03334, 2018.

Examples

seed <- 123; set.seed(seed)

###################
## Simulate data ##
###################

# Example with small problem sizes:
#
n <- 200; p <- 50; p_act <- 10; q <- 100; q_act <- 50

# Candidate predictors (subject to selection)
#
# Here example with common genetic variants under Hardy-Weinberg equilibrium
#
X_act <- matrix(rbinom(n * p_act, size = 2, p = 0.25), nrow = n)
X_inact <- matrix(rbinom(n * (p - p_act), size = 2, p = 0.25), nrow = n)

# shuffle indices 
shuff_x_ind <- sample(p)
shuff_y_ind <- sample(q)

X <- cbind(X_act, X_inact)[, shuff_x_ind]

# Association pattern and effect sizes
#
pat <- matrix(FALSE, ncol = q, nrow = p)
bool_x <- shuff_x_ind <= p_act
bool_y <- shuff_y_ind <= q_act

pat_act <- beta_act <- matrix(0, nrow = p_act, ncol = q_act)
pat_act[sample(p_act * q_act, floor(p_act * q_act / 5))] <- 1
beta_act[as.logical(pat_act)] <-  rnorm(sum(pat_act))

pat[bool_x, bool_y] <- pat_act

# Gaussian responses
#
Y_act <- matrix(rnorm(n * q_act, mean = X_act %*% beta_act), nrow = n)
Y_inact <- matrix(rnorm(n * (q - q_act)), nrow = n)

Y <- cbind(Y_act, Y_inact)[, shuff_y_ind]

########################
## Infer associations ##
########################

# Expectation and variance for the prior number of predictors associated with
# each response
#
p0 <- c(mean(colSums(pat)), 10)

res_atlas <- atlasqtl(Y = Y, X = X, p0 = p0, user_seed = seed)

hruffieux/atlasqtl documentation built on April 12, 2025, 12:54 p.m.

hruffieux/atlasqtl index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

hruffieux/atlasqtl
Flexible sparse regression for hierarchically-related responses using annealed variational inference

atlasqtl: Flexible sparse regression for hierarchically-related...
In hruffieux/atlasqtl: Flexible sparse regression for hierarchically-related responses using annealed variational inference

Flexible sparse regression for hierarchically-related responses using annealed variational inference.

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Related to atlasqtl in hruffieux/atlasqtl...

R Package Documentation

Browse R Packages

We want your feedback!

hruffieux/atlasqtl Flexible sparse regression for hierarchically-related responses using annealed variational inference

atlasqtl: Flexible sparse regression for hierarchically-related... In hruffieux/atlasqtl: Flexible sparse regression for hierarchically-related responses using annealed variational inference

Flexible sparse regression for hierarchically-related responses using annealed variational inference.

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Related to atlasqtl in hruffieux/atlasqtl...

R Package Documentation

Browse R Packages

We want your feedback!

hruffieux/atlasqtl
Flexible sparse regression for hierarchically-related responses using annealed variational inference

atlasqtl: Flexible sparse regression for hierarchically-related...
In hruffieux/atlasqtl: Flexible sparse regression for hierarchically-related responses using annealed variational inference