import | R Documentation |
This function imports fully processed expression datasets and results of
differential expression (DE) analysis from limma, edgeR, and DESeq2.
The imported data is converted to a SummarizedExperiment
,
annotating experimental design and genewise differential expression, which
then allows straightforward application of enrichment analysis methods.
import(obj, res, from = c("auto", "limma", "edgeR", "DESeq2"), anno = NA)
obj |
Expression data object. Supported options include
|
res |
Differential expression results. Expected to match the provided expression data object type, i.e. should be an object of class
|
from |
Character. Differential expression method from which to import
results from. Defaults to |
anno |
Character. The organism under study in KEGG three letter code (e.g. ‘hsa’ for ‘Homo sapiens’). For microarray probe-level data: the ID of a recognized microarray platform. See references. |
The expression data object (argument obj
) is expected to be fully
processed (including normalization and dispersion estimation) and to have
the experimental design annotated.
The experimental design is expected to describe *a comparison of two groups*
with an optional blocking variable for paired samples / sample batches (i.e.
design = ~ group
or design = ~ batch + group
.)
The differential expression result (argument res
) is expected to have
the same number of rows as the expression data object,
and also that the order of the rows is the same / consistent, i.e. that there
is a 1:1 correspondence between the rownames of obj
and the rownames
of res
. Note that the expression dataset is automatically restricted
to the genes for which DE results are available. However, for an appropriate
estimation of the size of the universe for competitive gene set tests, it is
recommended to provide DE results for all genes in the expression data object
whenever possible (see examples).
An object of class SummarizedExperiment
that
stores
the expression matrix in the assay
slot,
information about the samples, including the experimental design, in the
colData
slot, and
information about the genes, including measures of differential expression,
in the rowData
slot.
Mandatory annotations:
colData column storing binary group assignment (named "GROUP")
rowData column storing (log2) fold changes of differential expression between sample groups (named "FC")
rowData column storing adjusted (corrected for multiple testing) p-values of differential expression between sample groups (named "ADJ.PVAL").
Additional optional annotations:
colData column defining paired samples or sample blocks (named "BLOCK")
metadata slot named "annotation" giving the organism under investigation in KEGG three letter code (e.g. "hsa" for Homo sapiens)
metadata slot named "dataType" indicating the expression data type ("ma" for microarray, "rseq" for RNA-seq).
Ludwig Geistlinger
KEGG Organism code http://www.genome.jp/kegg/catalog/org_list.html
Recognized microarray platforms http://www.bioconductor.org/packages/release/BiocViews.html#___ChipName
readSE
for reading expression data from file,
normalize
for normalization of expression data,
voom
for preprocessing of RNA-seq data, p.adjust
for multiple testing correction, eBayes
for DE analysis with
limma, glmQLFit
for DE analysis with edgeR, and
DESeq
for DE analysis with DESeq2.
# Setup
## i) Generate count data
nsamples <- 4
ngenes <- 1000
dispers <- 1 / rchisq(ngenes, df = 10)
rdesign <- model.matrix(~factor(rep(c(1, 2), each = 2)))
counts <- rnbinom(ngenes * nsamples, mu = 20, size = 1 / dispers)
counts <- matrix(counts, nrow = ngenes, ncol = nsamples)
## ii) Generate intensity data
sd <- 0.3 * sqrt(4 / rchisq(100, df = 4))
intens <- matrix(rnorm(100 * 6, sd = sd), nrow = 100, ncol = 6)
rownames(intens) <- paste0("Gene", 1:100)
intens[1:2, 4:6] <- intens[1:2, 4:6] + 2
mdesign <- cbind(Grp1 = 1, Grp2vs1 = rep(c(0,1), each = 3))
# (1) import from edgeR (RNA-seq count data)
# (1a) create the expression data object
library(edgeR)
d <- DGEList(counts)
d <- calcNormFactors(d)
d <- estimateDisp(d, rdesign)
# (1b) obtain differential expression results
fit <- glmQLFit(d, rdesign)
qlf <- glmQLFTest(fit)
res <- topTags(qlf, n = nrow(d), sort.by = "none")
# (1c) import
se <- import(d, res)
# (2) import from DESeq2 (RNA-seq count data)
# (2a) create the expression data object
library(DESeq2)
condition <- factor(rep(c("A", "B"), each = 2))
dds <- DESeqDataSetFromMatrix(counts,
colData = DataFrame(condition = condition),
design = ~ condition)
# (2b) obtain differential expression results
dds <- DESeq(dds)
res <- results(dds)
# (2c) import
se <- import(dds, res)
# (3) import from voom/limma (RNA-seq count data)
# (3a) create the expression data object
library(limma)
keep <- filterByExpr(counts, rdesign)
el <- voom(counts[keep,], rdesign)
# (3b) obtain differential expression results
fit <- lmFit(el, rdesign)
fit <- eBayes(fit, robust = TRUE)
res <- topTable(fit, coef = 2, number = nrow(counts), sort.by = "none")
# (3c) import
se <- import(el, res)
# (4) import from limma-trend (RNA-seq count data)
# (4a) create the expression data object
logCPM <- edgeR::cpm(counts[keep,], log = TRUE, prior.count = 3)
el <- new("EList", list(E = logCPM, design = rdesign))
# (4b) obtain differential expression results
fit <- lmFit(el, rdesign)
fit <- eBayes(fit, trend = TRUE)
res <- topTable(fit, coef = 2, number = nrow(el), sort.by = "none")
# (4c) import
se <- import(el, res)
# (5) import from limma (microarray intensity data)
# (5a) create the expression data object
el <- new("EList", list(E = intens, design = mdesign))
# (5b) obtain differential expression results
fit <- lmFit(el, mdesign)
fit <- eBayes(fit, robust = TRUE)
res <- topTable(fit, coef = 2, number = nrow(el), sort.by = "none")
# (5c) import
se <- import(el, res)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.