Description Usage Arguments Details Value References See Also Examples

Variational approximation procedure fitting sparse multivariate regression models for combined selection of predictors and associated responses in high-dimensional set-ups. Dependence across responses linked to the same predictors is captured through the model hierarchical structure. The responses can be purely continuous, purely binary (logit or probit link fits), or a mix of continuous and binary variables.

1 2 3 4 |

`Y` |
Response data matrix of dimension n x d, where n is the number of observations and d is the number of response variables. |

`X` |
Input matrix of dimension n x p, where p is the number of candidate
predictors. |

`p0_av` |
Prior average number of predictors expected to be included in
the model. Must be |

`Z` |
Covariate matrix of dimension n x q, where q is the number of
covariates. Variables in |

`V` |
Annotation matrix of dimension p x r, where r is the number of
variables representing external information on the candidate predictors
which may make their selection more or less likely. |

`link` |
Response link. Must be " |

`ind_bin` |
If |

`list_hyper` |
An object of class " |

`list_init` |
An object of class " |

`list_cv` |
An object of class " |

`list_blocks` |
An object of class " |

`user_seed` |
Seed set for reproducible default choices of hyperparameters
(if |

`tol` |
Tolerance for the stopping criterion. |

`maxit` |
Maximum number of iterations allowed. |

`save_hyper` |
If |

`save_init` |
If |

`verbose` |
If |

The optimization is made using efficient block coordinate ascent schemes, for which convergence is ensured as the objective (elbo) is multiconcave for the selected blocks, i.e., it is concave in each block of parameters whose updates are made simultaneously, see Wu et al. (reference Section below).

The continuous response variables in `Y`

(if any) will be centered
before application of the variational algorithm, and the candidate predictors
and covariates resp. in `X`

and `Z`

will be standardized. An
intercept will be added if `link`

is `"logit"`

, `"probit"`

or
`"mix"`

(do not supply it in `X`

or `Z`

).

An object of class "`vb`

" containing the following variational
estimates and settings:

`gam_vb` |
Posterior inclusion probability matrix of dimension p x d. Entry (s, t) corresponds to the posterior probability of association between candidate predictor s and response t. |

`mu_alpha_vb` |
Matrix of dimension q x d whose entries are the posterior
mean regression coefficients for the covariates provided
in |

`mu_c0_vb` |
Vector of length p whose entries are the posterior intercept
coefficients at the level of probabilities of associations.
Entry s represents the control of the proportion of
responses associated with candidate predictor s which is not
due to external annotations. |

`mu_c_vb` |
Matrix of dimension r x d. Entry (l, k) contains the effect
of annotation l for response k on the probabilities of
associations. |

`om_vb` |
Vector of length p containing the posterior mean of omega.
Entry s controls the proportion of responses associated with
candidate predictor s. NULL if |

`converged` |
A boolean indicating whether the algorithm has converged
before reaching |

`it` |
Final number of iterations. |

`lb_opt` |
Optimized variational lower bound for the marginal log-likelihood (ELBO). |

`diff_lb` |
Difference in ELBO between the last and penultimate
iterations. This may be a useful diagnostic information when
convergence has not been reached before |

`p_star` |
Vector of length 1 or p defining the applied sparsity control. |

`rmvd_cst_x, rmvd_cst_z` |
Vectors containing the indices of constant
variables in |

`rmvd_coll_x, rmvd_coll_z` |
Vectors containing the indices of variables
in |

`list_hyper, list_init` |
If |

H. Ruffieux, A. C. Davison, J. Hager, I. Irincheeva. Efficient inference for genetic association studies with multiple outcomes. Biostatistics, 2017.

Y. Xu, and W. Yin. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on imaging sciences, 6, pp.1758-1789, 2013.

`set_hyper`

, `set_init`

,
`set_cv`

, `set_blocks`

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 | ```
seed <- 123; set.seed(seed)
###################
## Simulate data ##
###################
## Examples using small problem sizes:
##
n <- 200; p <- 250; p0 <- 25; d <- 30; d0 <- 25; q <- 3; r <- 3
## Candidate predictors (subject to selection)
##
# Here we simulate common genetic variants (but any type of candidate
# predictors can be supplied).
# 0 = homozygous, major allele, 1 = heterozygous, 2 = homozygous, minor allele
#
X_act <- matrix(rbinom(n * p0, size = 2, p = 0.25), nrow = n)
X_inact <- matrix(rbinom(n * (p - p0), size = 2, p = 0.25), nrow = n)
shuff_x_ind <- sample(p)
X <- cbind(X_act, X_inact)[, shuff_x_ind]
bool_x_act <- shuff_x_ind <= p0
pat_act <- beta <- matrix(0, nrow = p0, ncol = d0)
pat_act[sample(p0*d0, floor(p0*d0/5))] <- 1
beta[as.logical(pat_act)] <- rnorm(sum(pat_act))
## Covariates (not subject to selection)
##
Z <- matrix(rnorm(n * q), nrow = n)
alpha <- matrix(rnorm(q * d), nrow = q)
## Gaussian responses
##
Y_act <- matrix(rnorm(n * d0, mean = X_act %*% beta, sd = 0.5), nrow = n)
Y_inact <- matrix(rnorm(n * (d - d0), sd = 0.5), nrow = n)
shuff_y_ind <- sample(d)
Y <- cbind(Y_act, Y_inact)[, shuff_y_ind] + Z %*% alpha
## Binary responses
##
Y_bin <- ifelse(Y > 0, 1, 0)
## Informative annotation variables
##
V <- matrix(rnorm(p * r), nrow = p)
V[bool_x_act, ] <- rnorm(p0 * r, mean = 2)
########################
## Infer associations ##
########################
## Continuous responses
##
# We take p0_av = p0 (known here); this choice may, in some cases, result in
# (too) conservative variable selections. In practice, it is advised to set
# p0_av as a slightly overestimated guess of p0, or perform cross-validation
# using function `set_cv'.
# No covariate
#
vb_g <- locus(Y = Y, X = X, p0_av = p0, link = "identity", user_seed = seed)
# With covariates
#
vb_g_z <- locus(Y = Y, X = X, p0_av = p0, Z = Z, link = "identity",
user_seed = seed)
# With external annotation variables
#
vb_g_v <- locus(Y = Y, X = X, p0_av = p0, Z = Z, V = V, link = "identity",
user_seed = seed)
## Binary responses
##
vb_logit <- locus(Y = Y_bin, X = X, p0_av = p0, Z = Z, link = "logit",
user_seed = seed)
vb_probit <- locus(Y = Y_bin, X = X, p0_av = p0, Z = Z, link = "probit",
user_seed = seed)
## Mix of continuous and binary responses
##
Y_mix <- cbind(Y, Y_bin)
ind_bin <- (d+1):(2*d)
vb_mix <- locus(Y = Y_mix, X = X, p0_av = p0, Z = Z, link = "mix",
ind_bin = ind_bin, user_seed = seed)
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.