aggregateData: Aggregation of single-cell to pseudobulk data



Description

...

Usage

aggregateData(
  x,
  assay = NULL,
  by = c("cluster_id", "sample_id"),
  fun = c("sum", "mean", "median"),
  scale = FALSE
)

Arguments

x

a SingleCellExperiment.

assay

character string specifying the assay slot to use as input data. Defaults to the first available assay (assayNames(x)[1]).

by

character vector specifying which colData(x) columns to summarize by (at most two).

fun

character string specifying the function to use as summary statistic; one of "sum", "mean" or "median".
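To make the roles of by and fun concrete, here is a minimal base-R sketch of the idea. It is not the package's implementation: the toy counts, the annotation vectors and the helper aggregate_by are all hypothetical, and the result is a single flat matrix with one column per cluster-sample combination, whereas the examples for aggregateData suggest one assay per cluster with samples as columns.

```r
# Toy counts: 3 genes x 6 cells (values are made up for illustration)
counts <- matrix(
  c(1, 0, 2,
    3, 1, 0,
    0, 2, 1,
    4, 0, 0,
    1, 1, 1,
    2, 3, 0),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))

# Hypothetical per-cell annotations, standing in for colData(x) columns
cluster_id <- c("A", "A", "B", "B", "A", "B")
sample_id  <- c("s1", "s2", "s1", "s2", "s1", "s2")

# One group label per cell, combining both 'by' variables
group <- paste(cluster_id, sample_id, sep = ".")

# Hypothetical helper: summarize each gene within each group of cells.
# 'fun' can be sum, mean or median.
aggregate_by <- function(mat, group, fun = sum) {
  groups <- sort(unique(group))
  out <- vapply(groups, function(g) {
    apply(mat[, group == g, drop = FALSE], 1, fun)
  }, numeric(nrow(mat)))
  dimnames(out) <- list(rownames(mat), groups)
  out
}

pb_sum  <- aggregate_by(counts, group, sum)   # genes x (cluster.sample)
pb_mean <- aggregate_by(counts, group, mean)
```

Passing a single annotation vector as group (for example cluster_id alone) gives one column per cluster, which mirrors the by = "cluster_id" case.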

scale

logical. Should pseudobulks be scaled by the effective library size and multiplied by 1 million?
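The scaling described for scale amounts to a counts-per-million style transformation. The sketch below illustrates the arithmetic in base R under a simplifying assumption: the library size is taken as the plain column sum of a made-up pseudobulk matrix, whereas an effective library size would typically also include normalization factors. How aggregateData computes it internally is not shown here.

```r
# Made-up pseudobulk counts: 3 genes x 2 cluster-sample groups
pb <- matrix(
  c(2, 1, 3,
    6, 3, 0),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), c("A.s1", "B.s2")))

# Assumption: library size = total counts per column (no normalization factors)
lib_size <- colSums(pb)

# Divide each column by its library size, then multiply by 1 million
pb_scaled <- sweep(pb, 2, lib_size, "/") * 1e6
```

After this transformation every column sums to one million, so values are comparable across groups with different sequencing depth or cell numbers.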

Value

a SingleCellExperiment.

Aggregation parameters (assay, by, fun, scaled) are stored in metadata()$agg_pars, and the number of cells that were aggregated is accessible in metadata()$n_cells.

Author(s)

Helena L Crowell & Mark D Robinson

References

Crowell, HL, Soneson, C, Germain, P-L, Calini, D, Collin, L, Raposo, C, Malhotra, D & Robinson, MD: On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data. bioRxiv 713412 (2019). doi: https://doi.org/10.1101/713412

Examples

library(muscat)
library(SingleCellExperiment)
data(sce)

# pseudobulk counts by cluster-sample
pb <- aggregateData(sce)

assayNames(pb)  # one sheet per cluster
head(assay(pb)) # n_genes x n_samples

# scaled CPM
assays(sce)$cpm <- edgeR::cpm(assay(sce))
pb <- aggregateData(sce, assay = "cpm", scale = TRUE)
head(assay(pb))

# aggregate by cluster only
pb <- aggregateData(sce, by = "cluster_id")
length(assays(pb)) # single assay
head(assay(pb))    # n_genes x n_clusters

muscat documentation built on Nov. 8, 2020, 7:47 p.m.