knitr::opts_chunk$set( # code or die echo = TRUE, # minimize verbosity warning = FALSE, message = FALSE, # dpi = 150, # for hires images comment = "#>") set.seed(0xFEED) options(digits = 3)
This demonstrates how to use the facile.bio ecosystem to perform the analyses of RNA-seq data as outlined in the edgeR QLF From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline workflow.
library(edgeR) library(dplyr) library(FacileBiocData) library(FacileAnalysis) library(ggplot2) theme_set(theme_bw())
If you're here, we already assume you know what a DGEList is, so we've skipped the whole "exploration of the DGEList structure" part of the workflow and have encapsulated the DGEList building code to an internal function that we use here:
source(system.file("scripts", "edgeRQLF-workflow.R", package="FacileAnalysis")) y <- assemble_edgeRQLF_DGEList()
From here on out, we will perform these analyses using only the maneuvers made available in this package and the greater facile.bio ecosystem. We have an extensive amount of unit tests in this package that ensure that the results generated by the FacileAnalysis package corresponds to what is produced using standard Bioconductor maneuvers, which you can refer to.
You can also refer to the sister FacileRnaSeqGeneEdgeRQL-test.Rmd vignette to see how the exact statistics generated here match the ones produced using the standard edgeR/QLF framework.
In order to directly use this DGEList within the facile framework, we just have to invoke the FacileBiocData::facilitate function:
yf <- facilitate(y)
For now, we only support PCA:
pca.y <- fpca(yf, ntop = 500) colors <- c(lactating = "darkgreen", pregnant = "red", virgin = "blue") viz(pca.y, color_aes = "Status", shape_aes = "CellType", color_map = colors, width = 600)
Test differential among lactating b cells vs pregnant b cells.
Feture filtering is performed by default:
dge.b <- yf |> flm_def("group", numer = "B.lactating", denom = "B.pregnant") |> fdge(method = "edgeR-qlf")
tidy(dge.b) |> head(10)
The MA plots in facile are interactive, so we don't want to overload the web-browser. Here we'll chose to show a sample of both significant and "background" genes.
bg.genes <- tidy(dge.b) |> filter(padj > 0.1) |> sample_n(200) fg.genes <- tidy(dge.b) |> filter(padj < 0.1) |> sample_n(1000) viz(dge.b, type = "MA", features = bg.genes, highlight = fg.genes, width = 600)
viz(dge.b, type = "volcano", features = bg.genes, highlight = head(fg.genes, 100), width = 600)
We'll re-use the model from the original analsis and tweak the fdge parameters. Instead of automatically filtering, we just specify to use the same features from the previous analysis.
dge.bthres <- dge.b |> model() |> fdge(method = "edgeR-qlf", treat_lfc = log2(1.5), features = features(dge.b))
tidy(dge.bthres) |> head(10)
:::note The workflow goes into a heatmap section, but we will save that for later. There are more interesting differential expression examples in the workflow that we will continue here. :::
Suppose we want to compare the three groups within in the luminal population, i.e., virgin, pregnant and lactating:
anova.luminal <- yf |> filter_samples(CellType == "L") |> flm_def("Status") |> fdge(method = "edgeR-qlf")
tidy(anova.luminal) |> head(10)
The contrast defined in the workflow is basically testing an interaction:
con <- makeContrasts( (L.lactating-L.pregnant)-(B.lactating-B.pregnant), levels=design)
The facile way is to perform this analysis is to run the two individual differentail expression analyses independently, then compare the two.
For instance, we already have the "lactating"
vs "pregnant"
comparison
within the B cell population stored in dge.b
. We now have to run the
same comparison within the lumina cells:
dge.L <- yf |> flm_def("group", numer = "L.lactating", denom = "L.pregnant") |> fdge(method = "edgeR-qlf")
And now run the interaction. By default, each analysis will be rerun using the union of the (filtered) features from the individual comparisons so that we generate statistics for all genes from each comparison.
dge.i <- compare(dge.L, dge.b)
Facile heatmaps are not supported yet, but we can show how to use iheatmapr produce the same heatmap.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.