susie_rss  R Documentation 
susie_rss
performs variable selection under a
sparse Bayesian multiple linear regression of Y on X
using the zscores from standard univariate regression
of Y on each column of X, an estimate, R, of
the correlation matrix for the columns of X, and optionally,
but strongly recommended, the sample size n. See
“Details” for other ways to call susie_rss
susie_rss( z, R, n, bhat, shat, var_y, z_ld_weight = 0, estimate_residual_variance = FALSE, prior_variance = 50, check_prior = TRUE, ... )
z 
pvector of zscores. 
R 
p x p correlation matrix. 
n 
The sample size. 
bhat 
Alternative summary data giving the estimated effects
(a vector of length p). This, together with 
shat 
Alternative summary data giving the standard errors of
the estimated effects (a vector of length p). This, together with

var_y 
The sample variance of y, defined as y'y/(n1).
When the sample variance is not provided, the coefficients
(returned from 
z_ld_weight 
This parameter is included for backwards
compatibility with previous versions of the function, but it is no
longer recommended to set this to a nonzero value. When

estimate_residual_variance 
The default is FALSE, the
residual variance is fixed to 1 or variance of y. If the insample
LD matrix is provided, we recommend setting

prior_variance 
The prior variance(s) for the nonzero
noncentrality parameterss \tilde{b}_l. It is either a scalar,
or a vector of length L. When the 
check_prior 
When 
... 
Other parameters to be passed to

In some applications, particularly genetic applications,
it is desired to fit a regression model (Y = Xb + E say,
which we refer to as "the original regression model" or ORM)
without access to the actual values of Y and X, but
given only some summary statistics. susie_rss
assumes
availability of zscores from standard univariate regression of
Y on each column of X, and an estimate, R, of the
correlation matrix for the columns of X (in genetic
applications R is sometimes called the “LD matrix”).
With the inputs z
, R
and sample size n
,
susie_rss
computes PVEadjusted zscores z_tilde
, and
calls susie_suff_stat
with XtX = (n1)R
, Xty =
√{n1} z_tilde, yty = n1
, n = n
. The
output effect estimates are on the scale of b in the ORM with
standardized X and y. When the LD matrix
R
and the zscores z
are computed using the same
matrix X, the results from susie_rss
are same as, or
very similar to, susie
with standardized X and
y.
Alternatively, if the user provides n
, bhat
(the
univariate OLS estimates from regressing y on each column of
X), shat
(the standard errors from these OLS
regressions), the insample correlation matrix R =
cov2cor(crossprod(X)), and the variance of y, the results
from susie_rss
are same as susie
with X and
y. The effect estimates are on the same scale as the
coefficients b in the ORM with X and y.
In rare cases in which the sample size, n, is unknown,
susie_rss
calls susie_suff_stat
with XtX = R
and Xty = z
, and with residual_variance = 1
. The
underlying assumption of performing the analysis in this way is
that the sample size is large (i.e., infinity), and/or the
effects are small. More formally, this combines the loglikelihood
for the noncentrality parameters, \tilde{b} = √{n} b,
L(\tilde{b}; z, R) = (\tilde{b}'R\tilde{b}  2z'\tilde{b})/2,
with the “susie prior” on
\tilde{b}; see susie
and Wang et al
(2020) for details. In this case, the effect estimates returned by
susie_rss
are on the noncentrality parameter scale.
The estimate_residual_variance
setting is FALSE
by
default, which is recommended when the LD matrix is estimated from
a reference panel. When the LD matrix R
and the summary
statistics z
(or bhat
, shat
) are computed
using the same matrix X, we recommend setting
estimate_residual_variance = TRUE
.
A "susie"
object with the following
elements:
alpha 
An L by p matrix of posterior inclusion probabilites. 
mu 
An L by p matrix of posterior means, conditional on inclusion. 
mu2 
An L by p matrix of posterior second moments, conditional on inclusion. 
lbf 
logBayes Factor for each single effect. 
lbf_variable 
logBayes Factor for each variable and single effect. 
V 
Prior variance of the nonzero elements of b. 
elbo 
The value of the variational lower bound, or “ELBO” (objective function to be maximized), achieved at each iteration of the IBSS fitting procedure. 
sets 
Credible sets estimated from model fit; see

pip 
A vector of length p giving the (marginal) posterior inclusion probabilities for all p covariates. 
niter 
Number of IBSS iterations that were performed. 
converged 

G. Wang, A. Sarkar, P. Carbonetto and M. Stephens (2020). A simple new approach to variable selection in regression, with application to genetic finemapping. Journal of the Royal Statistical Society, Series B 82, 12731300 doi: 10.1101/501114.
Y. Zou, P. Carbonetto, G. Wang, G and M. Stephens (2022). Finemapping from summary data with the “Sum of Single Effects” model. PLoS Genetics 18, e1010299. doi: 10.1371/journal.pgen.1010299.
set.seed(1) n = 1000 p = 1000 beta = rep(0,p) beta[1:4] = 1 X = matrix(rnorm(n*p),nrow = n,ncol = p) X = scale(X,center = TRUE,scale = TRUE) y = drop(X %*% beta + rnorm(n)) input_ss = compute_suff_stat(X,y,standardize = TRUE) ss = univariate_regression(X,y) R = with(input_ss,cov2cor(XtX)) zhat = with(ss,betahat/sebetahat) res = susie_rss(zhat,R, n=n) # Toy example illustrating behaviour susie_rss when the zscores # are mostly consistent with a noninvertible correlation matrix. # Here the CS should contain both variables, and two PIPs should # be nearly the same. z = c(6,6.01) R = matrix(1,2,2) fit = susie_rss(z,R) print(fit$sets$cs) print(fit$pip) # In this second toy example, the only difference is that one # zscore is much larger than the other. Here we expect that the # second PIP will be much larger than the first. z = c(6,7) R = matrix(1,2,2) fit = susie_rss(z,R) print(fit$sets$cs) print(fit$pip)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.