runDE | R Documentation |
This function applies selected methods for differential expression (DE) analysis to selected datasets of an expression data compendium.
runDE(
  exp.list,
  de.method = c("limma", "edgeR", "DESeq2"),
  padj.method = "flexible",
  parallel = NULL,
  ...
)
metaFC(exp.list, max.na = round(length(exp.list)/3))
writeDE(exp.list, out.dir = NULL)
plotDEDistribution(exp.list, alpha = 0.05, beta = 1)
plotNrSamples(exp.list)
exp.list |
Experiment list. A |
de.method |
Differential expression method. See documentation of
|
padj.method |
Method for adjusting p-values for multiple testing. For
available methods see the man page of the stats function |
parallel |
Parallel computation mode. An instance of class
|
... |
Additional arguments passed to |
max.na |
Integer. Determines for which genes a meta fold change is
computed. By default, excludes genes for which the fold change is not
annotated in >= 1/3 of the datasets in |
out.dir |
Character. Determines the output directory where DE results
for each dataset are written to. Defaults to |
alpha |
Statistical significance level. Defaults to 0.05. |
beta |
Absolute log2 fold change cut-off. Defaults to 1 (2-fold). |
DE studies typically report a gene as differentially expressed if the corresponding DE p-value, corrected for multiple testing, satisfies the chosen significance level. Enrichment methods that work directly on the list of DE genes are then substantially influenced by the multiple testing correction.
An example is the frequently used over-representation analysis (ORA), which assesses the overlap between the DE genes and a gene set under study based on the hypergeometric distribution (see Appendix A of the EnrichmentBrowser vignette for an introduction).
ORA is inapplicable if there are few genes satisfying the significance threshold, or if almost all genes are DE.
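As a rough illustration of the overlap test described above, the following sketch computes a one-sided hypergeometric p-value with `phyper` from the stats package. The helper `oraPvalue` and its arguments are hypothetical names introduced here for illustration. They are not part of GSEABenchmarkeR or EnrichmentBrowser, whose own ORA implementation may differ in detail.

```r
# Hypothetical helper (not a package function): one-sided hypergeometric
# test for the overlap between DE genes and a gene set.
oraPvalue <- function(de.genes, gene.set, universe)
{
    # restrict both lists to the measured genes
    de.genes <- intersect(de.genes, universe)
    gene.set <- intersect(gene.set, universe)

    overlap <- length(intersect(de.genes, gene.set))

    # P(X >= overlap) when drawing length(de.genes) genes from the universe,
    # of which length(gene.set) belong to the gene set
    phyper(overlap - 1,
           m = length(gene.set),
           n = length(universe) - length(gene.set),
           k = length(de.genes),
           lower.tail = FALSE)
}

# usage on toy gene identifiers
universe <- paste0("g", 1:10)
oraPvalue(de.genes = c("g1", "g2", "g3"),
          gene.set = c("g1", "g2", "g3"),
          universe = universe)
```

With an empty DE list the overlap is always zero and the p-value is 1 for every gene set, which is one way to see why ORA carries no information when few genes pass the significance threshold.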
Using padj.method="flexible" accounts for these cases by applying a multiple testing correction that depends on the degree of differential expression:
the correction method from Benjamini and Hochberg (BH) is applied if it renders >= 1% and <= 25% of all measured genes as DE,
the p-values are left unadjusted if the BH correction results in < 1% DE genes, and
the more stringent Bonferroni correction is applied if the BH correction results in > 25% DE genes.
Note that the resulting p-values should not be used for assessing the statistical significance of DE genes within or between datasets. They are solely used to determine which genes are included in the analysis with ORA - where the flexible correction ensures that the fraction of included genes is roughly of the same order of magnitude across datasets.
Alternative strategies could also be applied - such as taking a constant number of genes for each dataset or excluding ORA methods in general from the assessment.
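The flexible correction described above can be sketched in a few lines using `p.adjust` from the stats package. This is a minimal illustration and not the package's actual code: the helper name `flexiblePadj`, the significance level `alpha = 0.05` used to call a gene DE, and the use of a strict `<` comparison against `alpha` are assumptions made here.

```r
# Hypothetical helper (not a package function) illustrating the three cases.
# Assumptions: a gene counts as DE if its BH-adjusted p-value is < alpha;
# the 1% and 25% bounds are taken from the description above.
flexiblePadj <- function(pvals, alpha = 0.05, lower = 0.01, upper = 0.25)
{
    bh <- p.adjust(pvals, method = "BH")

    # fraction of measured genes called DE after BH correction
    frac.de <- mean(bh < alpha, na.rm = TRUE)

    # too few DE genes: leave p-values unadjusted
    if (frac.de < lower) return(pvals)

    # too many DE genes: apply the more stringent Bonferroni correction
    if (frac.de > upper) return(p.adjust(pvals, method = "bonferroni"))

    # otherwise keep the BH-adjusted p-values
    bh
}

# usage: 10 strongly DE genes among 100 measured genes
pvals <- c(rep(1e-6, 10), rep(0.9, 90))
padj <- flexiblePadj(pvals)
```

Because the returned values mix three different corrections across datasets, they are only suitable for selecting the genes that enter ORA, as noted above.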
runDE returns exp.list with DE measures annotated to the rowData slot of each dataset, writeDE writes to file, and plotDEDistribution plots to a graphics device.
Ludwig Geistlinger <Ludwig.Geistlinger@sph.cuny.edu>
loadEData to load a specified expression data compendium.
# reading user-defined expression data from file
data.dir <- system.file("extdata/myEData", package="GSEABenchmarkeR")
edat <- loadEData(data.dir)
# differential expression analysis
edat <- runDE(edat)
# visualization of per-dataset DE distribution
plotDEDistribution(edat)
# calculating meta fold changes across datasets
mfcs <- metaFC(edat, max.na=0)
# writing DE results to file
out.dir <- tempdir()
out.dir <- file.path(out.dir, "de")
if(!file.exists(out.dir)) dir.create(out.dir)
writeDE(edat, out.dir)