mvsusie | R Documentation |
Performs a Bayesian multiple linear regression of Y on X. That is, this function fits the regression model
Y = \sum_l X
b_l + e,
where the elements of e
are i.i.d. normal
with zero mean and variance residual_variance
, and the sum
\sum_l b_l
is a vector of p effects to be estimated. The
SuSiE assumption is that each b_l
has exactly one non-zero
element.
mvsusie(
X,
Y,
L = 10,
prior_variance = 0.2,
residual_variance = NULL,
prior_weights = NULL,
standardize = TRUE,
intercept = TRUE,
approximate = FALSE,
estimate_residual_variance = FALSE,
estimate_prior_variance = TRUE,
estimate_prior_method = "EM",
check_null_threshold = 0,
prior_tol = 1e-09,
compute_objective = TRUE,
s_init = NULL,
coverage = 0.95,
min_abs_corr = 0.5,
compute_univariate_zscore = FALSE,
precompute_covariances = FALSE,
n_thread = 1,
max_iter = 100,
tol = 0.001,
verbosity = 2,
track_fit = FALSE
)
mvsusie_rss(
Z,
R,
N,
Bhat,
Shat,
varY,
prior_variance = 0.2,
residual_variance = NULL,
...
)
mvsusie_suff_stat(
XtX,
XtY,
YtY,
N,
L = 10,
X_colmeans = NULL,
Y_colmeans = NULL,
prior_variance = 0.2,
residual_variance = NULL,
prior_weights = NULL,
standardize = TRUE,
estimate_residual_variance = FALSE,
estimate_prior_variance = TRUE,
estimate_prior_method = "EM",
check_null_threshold = 0,
prior_tol = 1e-09,
compute_objective = TRUE,
precompute_covariances = FALSE,
s_init = NULL,
coverage = 0.95,
min_abs_corr = 0.5,
n_thread = 1,
max_iter = 100,
tol = 0.001,
verbosity = 2,
track_fit = FALSE
)
X |
N by J matrix of covariates. |
Y |
Vector of length N, or N by R matrix of response variables. |
L |
Maximum number of non-zero effects. |
prior_variance |
Can be either (1) a vector of length L, or a
scalar, for scaled prior variance when Y is univariate (which
should then be equivalent to |
residual_variance |
The residual variance |
prior_weights |
A vector of length p giving the prior probability that each element is non-zero. Note that the prior weights need to be non-negative but do not need to sum to 1; they will automatically be normalized to sum to 1 so that they represent probabilities. The default setting is that the prior weights are the same for all variables. |
standardize |
Logical flag specifying whether to standardize
columns of X to unit variance prior to fitting. If you do not
standardize you may need to think more carefully about specifying
the scale of the prior variance. Whatever the value of standardize,
the coefficients (returned by |
intercept |
Should intercept be fitted or set to zero. Setting
|
approximate |
Specifies whether to use approximate computation
for the intercept when there are missing values in Y. The
approximation saves some computational effort. Note that when the
residual_variance is a diagonal matrix, running |
estimate_residual_variance |
When
|
estimate_prior_variance |
When |
estimate_prior_method |
The method used for estimating the
prior variance; valid choices are |
check_null_threshold |
When the prior variance is estimated,
the estimate is compared against the null, and the prior variance
is set to zero unless the log-likelihood using the estimate is
larger than that of null by this threshold. For example, setting
|
prior_tol |
When the prior variance is estimated, compare the estimated value to this value at the end of the analysis and exclude a single effect from PIP computation if the estimated prior variance is smaller than it. |
compute_objective |
Add description of "compute_objective" input argument here. |
s_init |
A previous model fit with which to initialize. |
coverage |
Coverage of credible sets. |
min_abs_corr |
Minimum of absolute value of correlation
allowed in a credible set. The setting |
compute_univariate_zscore |
When
|
precompute_covariances |
If |
n_thread |
Maximum number of threads to use for parallel computation (only applicable when a mixture prior is used). |
max_iter |
Maximum number of iterations to perform. |
tol |
The model fitting will terminate when the increase in
ELBOs between two successive iterations is less than |
verbosity |
Set |
track_fit |
Add attribute |
Z |
J x R matrix of z-scores. |
R |
J x J LD matrix. |
N |
The sample size. |
Bhat |
Alternative summary data giving the estimated effects
(J X R matrix). This, together with |
Shat |
Alternative summary data giving the standard errors of
the estimated effects (J X R matrix). This, together with
|
varY |
The sample covariance of Y, defined as |
... |
Additional arguments passed to
|
XtX |
A J x J matrix |
XtY |
A J x R matrix |
YtY |
An R x R matrix |
X_colmeans |
A vector of length J giving the column means of
|
Y_colmeans |
A vector of length R giving the column means of
|
A multivariate susie fit, which is a list with some or all of the following elements:
alpha |
L by p matrix of posterior inclusion probabilites. |
b1 |
L by p matrix of posterior mean single-effect estimates. |
b1_rescaled |
L by p matrix |
of posterior mean single-effect
estimates on the original input scale (same as coef
).
b2 |
L by p matrix of posterior second moments. |
KL |
Vector of single-effect KL divergences. |
lbf |
Vector of single-effect log-Bayes factors. |
sigma2 |
Residual variance. |
V |
Prior variance. |
elbo |
Vector storing the the evidence lower bound, or “ELBO”, achieved at each iteration of the model fitting algorithm, which attempts to maximize the ELBO. |
niter |
Number of iterations performed. |
convergence |
Convergence status. |
sets |
Estimated credible sets. |
pip |
Vector of posterior inclusion probabilities. |
walltime |
Records runtime of the model fitting algorithm. |
z |
Vector of univariate z-scores. |
single_effect_lfsr |
Average lfsr (local false sign rate) for each CS. |
lfsr |
TO DO: Explain what this output is. |
conditional_lfsr |
The lfsr (local false sign rate) given that the variable is the single effect. |
# Example with one response.
set.seed(1)
n <- 2000
p <- 1000
beta <- rep(0, p)
beta[1:4] <- 1
X <- matrix(rnorm(n * p), n, p)
Y <- X %*% beta + rnorm(n)
fit <- mvsusie(X, Y, L = 10)
# Sufficient statistics example with one response.
X_colmeans <- colMeans(X)
Y_colmeans <- colMeans(Y)
X <- scale(X, center = TRUE, scale = FALSE)
Y <- scale(Y, center = TRUE, scale = FALSE)
XtX <- crossprod(X)
XtY <- crossprod(X, Y)
YtY <- crossprod(Y)
res <- mvsusie_suff_stat(XtX, XtY, YtY, n, L = 10, X_colmeans, Y_colmeans)
# RSS example with one response.
R <- crossprod(X)
z <- susieR:::calc_z(X, Y)
res <- mvsusie_rss(z, R, N = n, L = 10)
# Example with three responses.
set.seed(1)
n <- 500
p <- 1000
true_eff <- 2
X <- sample(c(0, 1, 2), size = n * p, replace = TRUE)
X <- matrix(X, n, p)
beta1 <- rep(0, p)
beta2 <- rep(0, p)
beta3 <- rep(0, p)
beta1[1:true_eff] <- runif(true_eff)
beta2[1:true_eff] <- runif(true_eff)
beta3[1:true_eff] <- runif(true_eff)
y1 <- X %*% beta1 + rnorm(n)
y2 <- X %*% beta2 + rnorm(n)
y3 <- X %*% beta3 + rnorm(n)
Y <- cbind(y1, y2, y3)
prior <- create_mixture_prior(R = 3)
fit <- mvsusie(X, Y, prior_variance = prior)
# Sufficient statistics example with three responses.
X_colmeans <- colMeans(X)
Y_colmeans <- colMeans(Y)
X <- scale(X, center = TRUE, scale = FALSE)
Y <- scale(Y, center = TRUE, scale = FALSE)
XtX <- crossprod(X)
XtY <- crossprod(X, Y)
YtY <- crossprod(Y)
res <- mvsusie_suff_stat(XtX, XtY, YtY, n,
L = 10, X_colmeans, Y_colmeans,
prior_variance = prior
)
# RSS example with three responses.
R <- crossprod(X)
Z <- susieR:::calc_z(X, Y)
res <- mvsusie_rss(Z, R, N = n, L = 10, prior_variance = prior)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.