| BBUM_plot | R Documentation |
Useful graphs for checking and viewing the results of BBUM correction and significance calling.
BBUM_plot(
df.bbum,
option = c("MA", "volcano", "hist", "ecdf", "ecdf_log", "ecdf.corr", "ecdf_log.corr",
"pp", "pcorr", "symm", "confusion"),
expressionCol = "baseMean",
pBBUM.alpha = 0.05,
two_tailed = FALSE
)
df.bbum |
The data.frame output of |
option |
Option of graph to plot. If a vector of length > 1 is provided, only the first element is used. Ignores case. |
expressionCol |
A |
pBBUM.alpha |
Cutoff level of |
two_tailed |
Toggle the "two-tailed" case of BBUM correction, if the
background assumption is weak and bona fide hits in the background class
are relevant. See Details. Default behavior is off. (Only for
|
The argument expressionCol allows plotting the MA graph
against a specified column as the x axis (expression level). For instance,
some may prefer to plot the mean normalized expression in control
experiments only, rather than the default "baseMean" of DESeq2.
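As a minimal sketch of preparing such a column, assume a hypothetical matrix of normalized counts, norm_counts, whose control samples are named WT_1 and WT_2, and a hypothetical results table res standing in for the BBUM-corrected output. None of these names come from the package. A control-only mean could then be added before plotting:

```r
# Hypothetical normalized counts: rows are genes, columns are samples.
norm_counts <- matrix(
  c(10, 12, 30, 28,
    100, 96, 90, 110,
    5, 7, 50, 44),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("WT_1", "WT_2", "KO_1", "KO_2"))
)

# Hypothetical results table standing in for the BBUM-corrected data.frame.
res <- data.frame(
  baseMean = rowMeans(norm_counts),
  log2FoldChange = c(1.4, 0.03, 2.9),
  row.names = rownames(norm_counts)
)

# Mean normalized expression in the control (WT) samples only.
control_cols <- grep("^WT_", colnames(norm_counts), value = TRUE)
res$WTmean <- rowMeans(norm_counts[rownames(res), control_cols, drop = FALSE])

print(res)
```

The new column name would then be supplied as expressionCol = "WTmean" when plotting the MA graph.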
Graph options are:
MA: MA plot (log2FoldChange against expressionCol)
volcano: Volcano plot (-log10(pvalue) against
log2FoldChange)
hist: p value histogram separated into signal and background set
points, with BBUM model overlaid. Background histogram is normalized by a
factor of 1 - theta to account for the lack of primary effects for
comparison.
ecdf: p value ECDF separated into signal and background set
points, with BBUM model overlaid.
ecdf_log: ECDF in log scale for the p values, which helps to focus
on the left tail.
ecdf.corr, ecdf_log.corr: As ecdf and ecdf_log, but plotting the
pBBUM values instead, to evaluate the FDR-corrected p values.
pp: P-P plot to evaluate the goodness of fit.
pcorr: Plot of p values, from raw values to BBUM-FDR-corrected
values, by data set. This plot is helpful for evaluating how the BBUM
algorithm corrects individual p values.
symm: Modified symmetry plot of -log10(p) values, excluding
hits, with -log10(0) as the mid-point instead of the median. Uses
subsampling to account for the different number of points in the signal and
background sets.
confusion: Plot of expected FDR, sensitivity, and specificity at
each raw p value, which shows the trade-off between these metrics
under the BBUM model fitted to the given dataset.
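The trade-off shown by the confusion option can be illustrated with a simplified three-component mixture. This is a sketch under stated assumptions, not the package's parameterization or fitting code: the weights, the shape parameters, and the helper confusion_at are all hypothetical, and each non-null component is taken to be Beta(a, 1), whose CDF is p^a.

```r
# Hypothetical mixture weights and shapes (not fitted BBUM parameters).
w_null <- 0.70       # uniform null component
w_secondary <- 0.20  # secondary effects, counted as false positives if called
w_primary <- 0.10    # primary effects, the true hits
a_secondary <- 0.6   # Beta(a, 1) has CDF p^a; a < 1 peaks near p = 0
a_primary <- 0.1

# Expected metrics when every p value <= p_cut is called a hit.
confusion_at <- function(p_cut) {
  false_pos <- w_null * p_cut + w_secondary * p_cut^a_secondary
  true_pos <- w_primary * p_cut^a_primary
  data.frame(
    p_cut = p_cut,
    FDR = false_pos / (false_pos + true_pos),
    sensitivity = true_pos / w_primary,
    specificity = 1 - false_pos / (w_null + w_secondary)
  )
}

metrics <- confusion_at(10^seq(-6, 0, by = 1))
print(metrics)
```

In this toy model, raising the cutoff increases sensitivity while the expected FDR rises and the specificity falls, which is the kind of trade-off the plot is meant to expose.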
The most critical region of the BBUM distribution for an appropriate correction for secondary effects is the left tail near p = 0, where both the primary and secondary beta components peak. An ECDF graph in log scale emphasizes this region and makes it easier to inspect.
ECDF graphs are overlaid on the x = y diagonal line, which
represents the uniform (null-only) case, i.e., no secondary effects.
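To see why the log scale helps, the following base R sketch simulates p values and compares how much of a linear versus a log-spaced grid lies in the left tail. The simulated mixture (900 uniform values plus 100 Beta(0.3, 1) values) and all variable names are assumptions for illustration only; the package itself draws these graphs with ggplot2.

```r
set.seed(1)
# Simulated p values: mostly uniform (null) plus a left-tail-enriched component.
p <- c(runif(900), rbeta(100, shape1 = 0.3, shape2 = 1))
Fhat <- ecdf(p)

# Evaluate the ECDF on a linear grid and on a log-spaced grid.
grid_linear <- seq(0.01, 1, length.out = 100)
grid_log <- 10^seq(-4, 0, length.out = 100)

# Fraction of grid points that fall below p = 0.01 on each scale.
mean(grid_linear < 0.01)
mean(grid_log < 0.01)

# Log-scale ECDF with the x = y diagonal (uniform, null-only case) overlaid.
plot(grid_log, Fhat(grid_log), log = "x", type = "s",
     xlab = "p value (log scale)", ylab = "ECDF")
lines(grid_log, grid_log, lty = 2)
```

On the log-spaced grid a much larger share of the x axis is devoted to small p values, so deviations of the ECDF from the dashed diagonal in that region become visible.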
Because the peak near p = 0 is the most informative region
for p value correction, a P-P plot is more appropriate than a Q-Q plot
for assessing the goodness of fit of BBUM models.
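A rough sketch of this argument, assuming a single hypothetical Beta(a, 1) component with a = 0.2 in place of a fitted BBUM model: compute the P-P and Q-Q coordinates for simulated p values and compare how much of each plot's x axis is occupied by observations with p < 0.01.

```r
set.seed(2)
a <- 0.2  # hypothetical shape; Beta(a, 1) has CDF p^a and peaks near p = 0
p <- sort(rbeta(500, shape1 = a, shape2 = 1))
n <- length(p)
empirical <- (seq_len(n) - 0.5) / n

# P-P coordinates: model CDF at each observed p value vs. empirical probability.
pp_x <- pbeta(p, shape1 = a, shape2 = 1)
pp_y <- empirical

# Q-Q coordinates: model quantile at each empirical probability vs. observed p value.
qq_x <- qbeta(empirical, shape1 = a, shape2 = 1)
qq_y <- p

# Share of each plot's x-axis range taken up by observations with p < 0.01.
small <- p < 0.01
pp_share <- diff(range(pp_x[small])) / diff(range(pp_x))
qq_share <- diff(range(qq_x[small])) / diff(range(qq_x))
c(PP = pp_share, QQ = qq_share)
```

In this simulation the small p values span a substantial part of the P-P axis but are compressed into a narrow corner of the Q-Q plot, so misfit near p = 0 is easier to see on the P-P scale.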
Plot symm is for the validation of the assumption that the
signal and background sets have roughly similar background (null and
secondary effects) distributions of p values. As excluding hits does not
exclude the false-negative region, there is still an expected discrepancy
at low p values. The implemented color gradient attempts to reflect
this expected upward deviation from the diagonal line when the fraction of
remaining primary effects is large. Empirically, distributions that do not
deviate beyond the +/- log(10) dashed lines when the expected fraction of
primary effects is low are symmetrical enough for accurate BBUM
correction.
A ggplot2 plot object.
## Not run:
BBUM_plot(df.bbum = res.BBUMcorr,
option = "ecdf_log",
expressionCol = "WTmean",
pBBUM.alpha = 0.01,
two_tailed = FALSE
)
## End(Not run)