pFdr | R Documentation |
This function calculates either (1) resampling based p-values with subsequent
p-value adjustment using stats::p.adjust
or (2)
resampling based false-discovery-rates (FDRs) for rotated statistics from a
rotateStat
object.
pFdr(obj, method = "none", pooled = TRUE, na.rm = FALSE, beta = 0.05)
obj |
A |
method |
Can be either |
pooled |
|
na.rm |
|
beta |
|
Larger values of obj$s0 are considered more significant when
compared to the empirical distribution. E.g. for calculation of resampling
based p-values (with pooled = FALSE
) we in principle use
p.val <- (rowSums(obj$stats >= obj$s0)+1)/(ncol(obj$stats)+1)
according to
\insertCitePhipson2010randRotation.
method = "fdr.q"
and method = "fdr.qu"
are resampling based
fdr estimates and can only be used with pooled = TRUE
. method
= "fdr.q"
is the FDR local estimator and method = "fdr.qu"
is the
FDR upper limit, see \insertCiteReiner2003,Yekutieli1999randRotation.
For all other method
arguments resampling based p-values are
calculated and passed to stats::p.adjust
for
p-value adjustment. So these methods provide resampling based p-values with
(non-resampling based) p-value adjustment.
method = "fdr.q"
and method = "fdr.qu"
were
adapted from package fdrame
\insertCiteFdrame2019,Reiner2003randRotation.
When pooled = TRUE
,
marginal distributions of the test statistics are considered exchangeable
for all features. The resampling based p-values of each feature are then
calculated from all rotated statistics (all features, all rotations). For
these cases, if the number of features is reasonably large, usually only
few resamples (argument R
in
rotateStat
) are required.
We want to emphasize that in order for the marginal distributions to be
exchangeable, the statistics must be a pivotal quantity (i.e. it must be
scale independent).
Pivotal quantities are e.g. t values. Using e.g. linear models
with coef
as statistics is questionable if the different features
are measured on different scales. The resampled coefficients then
have different variances and pooled = TRUE
is not applicable.
We thus highly recommend using pivotal quantities as statistics
in
rotateStat
if possible.
When pooled = FALSE
the resampling based p-values are calculcated
for each feature separately. This is required if one expects the resampling
based statistics to be distributed differently for individual features. For
most common applications this should not be the case and the marginal
distribution are exchangeable for all features, hence pooled = TRUE
by default.
If method = "fdr.q"
or method = "fdr.qu"
and
weights
were specified when initialising the random rotation
object (see parameter initialised.obj
in
rotateStat
), a warning is displayed.
The correlation structure (dependence structure) of linear model
coefficients between different features is not generally preserved if
different weights are used for different features.
Methods fdr.q
and fdr.qu
rely on preserved correlation
structure of dependent statistics and thus should not be used if statistics
based on model coefficients (e.g. t statistics of model coefficients) are
used in combination with different weights.
P-values and FDRs are calculated for each column of obj$s0
separately.
A numeric
matrix of corrected p-values or FDRs with dimension dim(obj$s0)
.
Peter Hettegger
rotateStat
# See also '?rotateStat':
#set.seed(0)
# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100
# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))
mod1 <- model.matrix(~phenotype, pdata)
# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1
# Definition of the batch effect correction procedure with subsequent calculation
# of two-sided test statistics
statistic <- function(., batch, mod, coef){
# The "capture.output" and "suppressMessages" simply suppress any output
capture.output(suppressMessages(
Y.tmp <- sva::ComBat(., batch = batch, mod)
))
fit1 <- lm.fit(mod, t(Y.tmp))
abs(coef(fit1)[coef,])
}
# We calculate test statistics for the second coefficient
res1 <- rotateStat(initialised.obj = init1,
R = 10,
statistic = statistic,
batch = pdata$batch, mod = mod1, coef = 2)
hist(pFdr(res1))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.