Description Usage Arguments Details Value References Examples
apeglm provides Bayesian shrinkage estimators for effect sizes in GLM models, using approximation of the posterior for individual coefficients.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28  apeglm(
Y,
x,
log.lik,
param = NULL,
coef = NULL,
mle = NULL,
no.shrink = FALSE,
interval.type = c("laplace", "HPD", "credible"),
interval.level = 0.95,
threshold = NULL,
contrasts,
weights = NULL,
offset = NULL,
flip.sign = TRUE,
prior.control,
multiplier = 1,
ngrid = 50,
nsd = 5,
ngrid.nuis = 5,
nsd.nuis = 2,
log.link = TRUE,
param.sd = NULL,
method = c("general", "nbinomR", "nbinomCR", "nbinomC", "nbinomC*", "betabinR",
"betabinCR", "betabinC", "betabinC*"),
optim.method = "BFGS",
bounds = c(Inf, Inf)
)

Y 
the observations, which can be a matrix or SummarizedExperiment,
with columns for samples and rows for "features" (e.g. genes in a genomic context).
If Y is a SummarizedExperiment, 
x 
design matrix, with intercept in the first column. Continuousvalued columns should be centered and scaled to unit variance, or at least set so that the scale (in SD) is not very large or very small 
log.lik 
the log of likelihood function, specified by the user.
For Negative Binomial distribution, user can use 
param 
the other parameter(s) to be used in the likelihood function, e.g. the dispersion parameter for a negative binomial distribution. this can be a vector or a matrix (with columns as parameters) 
coef 
(optional) the index of the coefficient for which to generate posterior estimates (FSR and intervals) 
mle 
(optional) a 2 column matrix giving the MLE and its standard error
of 
no.shrink 
logical, if TRUE, apeglm won't perform shrinkage (default is FALSE) 
interval.type 
(optional) can be "laplace", "HPD", or "credible", which specifies the type of Bayesian interval that the user wants to output; "laplace" represents the Laplace approximation of the posterior mode 
interval.level 
(optional) default is 0.95 
threshold 
(optional) a threshold for integrating posterior probabilities, see details under 'Value'. Note that this should be on the natural log scale for a log link GLM. 
contrasts 
(optional) contrast matrix, same number of rows as 
weights 
(optional) weights matrix, same shape as 
offset 
(optional) offsets matrix, same shape as 
flip.sign 
whether to flip the sign of threshold value when MAP is negative, default is TRUE (threshold must then be positive) 
prior.control 
see Details 
multiplier 
a positive number, when the prior is adapted to the 
ngrid 
the number of grid points for grid integration of intervals 
nsd 
the number of SDs of the Laplace posterior approximation to set the left and right edges of the grid around the MAP 
ngrid.nuis 
the number of grid points for nuisance parameters 
nsd.nuis 
the number of Laplace standard errors to set the left and right edges of the grid around the MAP of the nuisance parameters 
log.link 
whether the GLM has a log link (default = TRUE) 
param.sd 
(optional) potential uncertainty measure on the parameter 
method 
options for how apeglm will find the posterior mode and SD.
The default is "general" which allows the user to specify a likelihood
in a general way. Alternatives for faster performance with the Negative Binomial
likelihood are:
"nbinomR", "nbinomCR", and "nbinomC" / "nbinomC*"
These alternative methods should provide increasing speeds,
respectively.
(Also for beta binomial, substitute "betabin" for "nbinom" in the
above.)
From testing on RNAseq data, the nbinom methods are roughly 5x, 10x and 50x faster than "general".
Note that "nbinomC" uses C++ to find the MAP for the coefficients,
but does not calculate or return the posterior SD or other quantities.
"nbinomC*" is the same as "nbinomC", but includes a random start for finding the MAP.
"nbinomCR" uses C++ to calculate the MAP and then estimates
the posterior SD in R, with the exception that if the MAP from C++
did not converge or gives negative estimates of posterior variance,
then this row is refit using optimization in R.
These alternatives require the degrees of freedom for the prior distribution to be 1,
and will ignore any function provided to the 
optim.method 
the method passed to 
bounds 
the bounds for the numeric optimization 
prior.control
is a list of parameters that will be passed to determine
the prior distribution. Users are allowed to have a Normal prior on the
intercept, and a t prior on the nonintercept coefficients (similar
to bayesglm
in the arm
package. The following are defaults:
no.shrink = 1
: index of the coefficient(s) not to shrink
prior.mean = 0
: mean of t prior
prior.scale = 1
: scale of t prior
prior.df = 1
: df of t prior
prior.no.shrink.mean = 0
: mean of Normal
prior.no.shrink.scale = 15
: scale of Normal
So without specifying prior.control
, the following is set inside apeglm
:
prior.control < list(no.shrink=1,prior.mean=0,prior.scale=1,
prior.df=1,prior.no.shrink.mean=0,prior.no.shrink.scale=15)
Note that the prior should be defined on the natural log scale for a log link GLM.
a list of matrices containing the following components:
map
: matrix of MAP estimates, columns for coefficients and rows for features
sd
: matrix of posterior SD, same shape as map
prior.control
: list with details on the prior
fsr
: vector of the false sign rate for coef
interval
: matrix of either HPD or credible interval for coef
thresh
: vector of the posterior probability that the estimated parameter
is smaller than the threshold value specified in threshold
when MAP is positive (or greater than
1 * threshold value when MAP is negative and flip.sign is TRUE)
diag
: matrix of diagnostics
contrast.map
: vector of MAP estimates corresponding to the contrast
when contrast
is given
contrast.sd
: vector of posterior SD corresponding to the contrast
when contrast
is given
ranges
: GRanges or GRangesList with the estimated coefficients,
if Y
was a SummarizedExperiment.
Note that all parameters associated with coefficients,
e.g. map
, sd
, etc., are returned on the natural log scale for a log link GLM.
Adaptive Cauchy prior:
Zhu, A. (2018) Heavytailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics. doi: 10.1093/bioinformatics/bty895
False sign rate and svalue:
Stephens, M. (2016) False discovery rates: a new deal. Biostatistics, 18:2. doi: 10.1093/biostatistics/kxw041
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44  # Simulate RNASeq read counts data
# 5 samples for each of the two groups
# a total of 100 genes
n.per.group < 5
n < n.per.group * 2
m < 100
# The design matrix includes one column of intercept
# and one column indicating samples that belong to the second group
condition < factor(rep(letters[1:2], each = n.per.group))
x < model.matrix(~condition)
# Specify the standard deviation of beta (LFC between groups)
beta.sd < 2
beta.cond < rnorm(m, 0, beta.sd)
beta.intercept < runif(m, 2, 6)
beta.mat < cbind(beta.intercept, beta.cond)
# Generate the read counts
mu < exp(t(x %*% t(beta.mat)))
Y < matrix(rnbinom(m*n, mu=mu, size=1/.1), ncol = n)
# Here we will use the negative binomial log likelihood
# which is an exported function. See 'logLikNB' for details.
# For the NB:
# 'param' is the dispersion estimate (1/size)
# 'offset' can be used to adjust for size factors (log of size factors)
param < matrix(0.1, nrow = m, ncol = 1)
offset < matrix(0, nrow = m, ncol = n)
# Shrinkage estimator of betas:
# (for adaptive shrinkage, 'apeglm' requires 'mle' coefficients
# estimated with another software, or by first running 'apeglm'
# setting 'no.shrink=TRUE'.)
res < apeglm(Y = Y, x = x,
log.lik = logLikNB,
param = param,
offset = offset,
coef = 2)
head(res$map)
plot(beta.mat[,2], res$map[,2])
abline(0,1)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.