vizWithSCE is an R package that assists with the visualization of the integration of bulk DE results tables with pre-processed scRNA-seq datasets available on Bioconductor. vizWithSCE currently takes two arguments, a list containing res, dds, and a SingleCellExperiment (sce) as selected by the user and a second argument which. The which argument is the gene to use for the violin plot, represented as a number that corresponds to the gene with the n-th smallest p-value, e.g., 1 for the gene with lowest p-value, etc.
We first load gene-summarized RNA-seq counts.
library(airway)
For our workflow, we used the bulk dataset "airway", which contains four human airway smooth muscle cell lines treated with dexamethasone, a steroid used to treat inflammation. This pre-processed dataset is available on Bioconductor. Here we filter to at least 3 samples with a count of 10 or higher, and then perform differential expression analysis.
Here, we changed the rownames to match the format of the sce.
library(DESeq2) data(gse) levels(gse$condition) <- c("untrt", "trt") dds <- DESeqDataSet(gse, design = ~donor + condition) keep <- rowSums(counts(dds) >= 10) >= 3 dds <- dds[keep,] dds <- DESeq(dds) res <- results(dds) rownames(res) <- sub("\\..*","",rownames(res))
# the following is evaluated but hidden, # because the above un-evaluated code chunk # is slow to run library(vizWithSCE) data(dds) dds <- DESeq2::estimateSizeFactors(dds) data(res)
integrateWithSingleCell
We run integrateWithSingleCell
and pick number 8 from the dataset
selection as our single-cell dataset. Within PBMC there are nine
different scRNA-seq datasets. We chose the dataset pbmc4k, which
contains roughly 4 thousand cells. This pre-processed dataset is
available on Bioconductor.
dat <- integrateWithSingleCell(res, dds)
# the following is evaluated but hidden, # because the above un-evaluated code chunk # involves user input. library(TENxPBMCData) library(scater) tenx_pbmc4k <- TENxPBMCData(dataset = "pbmc4k") sce <- tenx_pbmc4k dat <- list(res=res, dds=dds, sce=sce) dat$sce <- logNormCounts(dat$sce)
We use scater's logNormCounts
function to add normalized data to
the sce
. We create a copy of the sce
object in order to perform
cell type labeling.
library(scater) dat$sce <- scater::logNormCounts(dat$sce) sce2 <- dat$sce
We convert the rownames from Ensembl to gene symbols, to match with the cell annotation reference dataset, described below.
library(org.Hs.eg.db) rownames(sce2) <- mapIds(org.Hs.eg.db, rownames(sce2), "SYMBOL", "ENSEMBL")
We use celldex which is a database of reference expression datasets
along with SingleR to provide cell type recognition. Now dat
contains labels of different cell types.
library(celldex) library(SingleR) ref <- celldex::BlueprintEncodeData() pred <- SingleR::SingleR(test=sce2, ref=ref, labels=ref$label.main) table(pred$labels) colLabels(dat$sce) <- pred$labels
# the following is evaluated but hidden, # because the above un-evaluated code chunk # is slow to run data(labels) colLabels(dat$sce) <- labels
We use vizWithSCE
to pick the second gene by smallest p-value, and
make a violin plot over the cell labels from SingleR.
DUSP1 is known to play a role in response to environmental stress,
and is most highly expressed in monocytes according to the single cell
data.
library(vizWithSCE) vizWithSCE(dat, which=2)
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.