pbDS: pseudobulk DS analysis

View source: R/pbDS.R

pbDS    R Documentation

pseudobulk DS analysis

Description

pbDS tests for differential state (DS) after aggregating single-cell measurements to pseudobulk data, by applying bulk RNA-seq differential expression (DE) methods such as edgeR, DESeq2 and limma.
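To make the aggregation step concrete, here is a minimal base R sketch that sums counts per cluster-sample and flags which cluster-sample combinations would pass a cell-count threshold. It is an illustration only, not muscat's implementation (in practice aggregateData does this); the toy matrix, the labels and the threshold of 2 are all assumptions made for the example.

```r
set.seed(1)

# Toy count matrix: 4 genes x 12 cells (hypothetical data).
counts <- matrix(rpois(4 * 12, lambda = 5), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:12)))

# Hypothetical per-cell annotations: 3 samples, 2 clusters.
sample_id  <- rep(c("s1", "s2", "s3"), each = 4)
cluster_id <- c("A", "A", "A", "B",  "A", "A", "B", "B",  "A", "B", "B", "B")

# One pseudobulk matrix (genes x samples) per cluster, obtained by summing
# the counts of all cells that belong to the same cluster-sample.
pb_list <- lapply(split(seq_len(ncol(counts)), cluster_id), function(i)
  t(rowsum(t(counts[, i, drop = FALSE]), group = sample_id[i])))

# Number of cells per cluster-sample, and which combinations reach the
# (assumed) minimum of 2 cells, in the spirit of the min_cells argument.
n_cells   <- table(cluster_id, sample_id)
min_cells <- 2
keep      <- n_cells >= min_cells

pb_list$A
keep
```

Summing is only one possible aggregation; it is used here because the examples on this page refer to pseudobulk sum-counts.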

Usage

pbDS(
  pb,
  method = c("edgeR", "DESeq2", "limma-trend", "limma-voom", "DD"),
  design = NULL,
  coef = NULL,
  contrast = NULL,
  min_cells = 10,
  filter = c("both", "genes", "samples", "none"),
  treat = FALSE,
  verbose = TRUE,
  BPPARAM = SerialParam(progressbar = verbose)
)

pbDD(
  pb,
  design = NULL,
  coef = NULL,
  contrast = NULL,
  min_cells = 10,
  filter = c("both", "genes", "samples", "none"),
  verbose = TRUE,
  BPPARAM = SerialParam(progressbar = verbose)
)

Arguments

pb

a SingleCellExperiment containing pseudobulks as returned by aggregateData.

method

a character string specifying the method to use; one of "edgeR", "DESeq2", "limma-trend", "limma-voom" or "DD".

design

For methods "edgeR" and "limma", a design matrix with row and column names (required), created with model.matrix; for "DESeq2", a formula with variables in colData(pb). Defaults to ~ group_id or the corresponding model.matrix.

coef

passed to glmQLFTest, contrasts.fit and results for methods "edgeR", "limma-x" (i.e., "limma-trend" or "limma-voom") and "DESeq2", respectively. Can be a list for multiple, independent comparisons.

contrast

a matrix of contrasts to test, created with makeContrasts.

min_cells

a numeric. Specifies the minimum number of cells in a given cluster-sample required to consider the sample for differential testing.

filter

character string specifying whether to filter on genes, samples, both or neither.

treat

logical specifying whether empirical Bayes moderated-t p-values should be computed relative to a minimum fold-change threshold. Only applicable for methods "limma-x" (treat) and "edgeR" (glmTreat), and ignored otherwise.

verbose

logical. Should information on progress be reported?

BPPARAM

a BiocParallelParam object specifying how differential testing should be parallelized.
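As a sketch of how the design and contrast arguments fit together, the following builds a named design matrix with model.matrix and a matching "stim-ctrl" contrast. The sample table ei is hypothetical (in practice the sample annotation would come from the pseudobulk object), and the hand-built fallback matrix is only there so the sketch runs without limma installed.

```r
# Hypothetical sample-level annotation with two groups of three samples.
ei <- data.frame(
  sample_id = paste0("s", 1:6),
  group_id  = factor(rep(c("ctrl", "stim"), each = 3),
    levels = c("ctrl", "stim")),
  stringsAsFactors = FALSE)

# Design matrix without intercept: one column per group.
design <- model.matrix(~ 0 + group_id, data = ei)

# Row and column names are set explicitly:
# rows = samples, columns = group levels.
dimnames(design) <- list(ei$sample_id, levels(ei$group_id))

# Contrast "stim - ctrl"; uses limma::makeContrasts when limma is available,
# otherwise builds an equivalent matrix by hand.
if (requireNamespace("limma", quietly = TRUE)) {
  contrast <- limma::makeContrasts("stim-ctrl", levels = design)
} else {
  contrast <- matrix(c(-1, 1), ncol = 1,
    dimnames = list(Levels = colnames(design), Contrasts = "stim-ctrl"))
}

design
contrast

# With a pseudobulk object pb at hand, these would be passed on as:
# res <- pbDS(pb, design = design, contrast = contrast)
```

Dropping the intercept (~ 0 + group_id) keeps one column per group, so the contrast can be written directly in terms of group names.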

Value

a list containing

  • a data.frame with differential testing results,

  • a DGEList object of length equal to the number of clusters, and

  • the design matrix, and contrast or coef used.

Author(s)

Helena L Crowell & Mark D Robinson

References

Crowell, HL, Soneson, C, Germain, P-L, Calini, D, Collin, L, Raposo, C, Malhotra, D & Robinson, MD: On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data. bioRxiv 713412 (2019). doi: https://doi.org/10.1101/713412

Examples

# load the package & example data
library(muscat)
data(example_sce)

# compute pseudobulk sum-counts & run DS analysis
pb <- aggregateData(example_sce)
res <- pbDS(pb, method = "limma-trend")

names(res)
names(res$table)
head(res$table$stim$`B cells`)

# count the number of DE genes by cluster
vapply(res$table$stim, function(u) 
  sum(u$p_adj.loc < 0.05), numeric(1))

# get the top 5 hits for each cluster with abs(logFC) > 1
# (dplyr::filter is called explicitly to avoid clashes with stats::filter)
library(dplyr)
lapply(res$table$stim, function(u)
  dplyr::filter(u, abs(logFC) > 1) %>% 
    arrange(p_adj.loc) %>% 
    slice(seq_len(5)))
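
The per-cluster tables can also be stacked into a single data.frame before filtering. The sketch below does this in base R on a mock object that only mirrors the nesting seen above (contrast, then cluster, then data.frame). The values are made up, the gene column is an assumption, and flatten_tbl is a hypothetical helper, not a muscat function.

```r
# Mock object with the same nesting as res$table (values are made up).
tbl <- list(stim = list(
  `B cells` = data.frame(
    gene      = c("g1", "g2", "g3"),
    logFC     = c(1.8, -0.3, -1.2),
    p_adj.loc = c(0.001, 0.70, 0.03)),
  `T cells` = data.frame(
    gene      = c("g1", "g2", "g3"),
    logFC     = c(0.2, 2.1, -0.1),
    p_adj.loc = c(0.40, 0.004, 0.90))))

# Hypothetical helper: stack all per-cluster tables of one contrast,
# recording the cluster each row came from.
flatten_tbl <- function(x) {
  out <- do.call(rbind, lapply(names(x), function(k)
    cbind(cluster = k, x[[k]], stringsAsFactors = FALSE)))
  rownames(out) <- NULL
  out
}

flat <- flatten_tbl(tbl$stim)

# Keep rows with adjusted p-value < 0.05 and abs(logFC) > 1,
# ordered by adjusted p-value.
hits <- flat[flat$p_adj.loc < 0.05 & abs(flat$logFC) > 1, ]
hits <- hits[order(hits$p_adj.loc), ]
hits
```

On a real result, the same helper would be applied to res$table$stim, assuming the per-cluster tables share the same columns.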


HelenaLC/muscat documentation built on Oct. 9, 2024, 11:59 a.m.