removeAmbience: Remove the ambient profile

Description

Estimate and remove the ambient profile from a count matrix, given pre-existing groupings of similar cells. This function is largely intended for plot beautification rather than real analysis.

Usage

removeAmbience(
  y,
  ambient,
  groups,
  features = NULL,
  ...,
  size.factors = librarySizeFactors(y),
  dispersion = 0.1,
  sink = NULL,
  BPPARAM = SerialParam()
)

Arguments

y

A numeric matrix-like object containing counts for each gene (row) and cell/library (column).

ambient

A numeric vector of length equal to nrow(y), containing the proportions of transcripts for each gene in the ambient solution.

groups

A vector of length equal to ncol(y), specifying the assigned group for each cell. This can also be a DataFrame; see ?sumCountsAcrossCells.

features

A vector of control features or a list of mutually exclusive feature sets; see ?controlAmbience for more details.

...

Further arguments to pass to maximumAmbience.

size.factors

Numeric vector of length equal to ncol(y), specifying the size factor for each column of y. Defaults to library size-derived size factors.

dispersion

Numeric scalar specifying the dispersion to use in the quantile-quantile mapping.

sink

An optional RealizationSink object of the same dimensions as y.

BPPARAM

A BiocParallelParam object specifying how parallelization should be performed.
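
As a rough illustration of the size.factors default, the base R sketch below computes library size-derived size factors by hand. It assumes the usual definition (column sums scaled to a mean of 1) and uses a made-up toy matrix; it is not a substitute for librarySizeFactors itself.

```r
# Toy count matrix: 4 genes x 3 cells (hypothetical values).
y <- matrix(c(1, 0, 3, 6,
              2, 2, 4, 12,
              0, 1, 1, 3), nrow=4)

# Library size for each cell (column).
lib.sizes <- colSums(y)

# Scale so that the size factors average to 1 across cells (assumed convention).
size.factors <- lib.sizes / mean(lib.sizes)
```

Note that the result has one entry per column of y, which is the length that size.factors is expected to have.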

Details

This function will aggregate counts from each group of related cells into an average profile. For each group, we estimate the contribution of the ambient profile and subtract it from the average. By default, this is done with maximumAmbience, but if enough is known about the biological system, users can specify features to use controlAmbience instead.
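
The aggregation and subtraction steps can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy data and the assumed.fraction scaling are hypothetical, whereas the function itself estimates the ambient contribution with maximumAmbience or controlAmbience.

```r
# Toy data: 5 genes x 6 cells in two groups (all values hypothetical).
set.seed(100)
y <- matrix(rpois(30, lambda=5), nrow=5)
groups <- rep(c("A", "B"), each=3)
ambient <- c(0.4, 0.3, 0.1, 0.1, 0.1)  # ambient proportions per gene

# Average profile for each group: one column per group.
by.group <- split(seq_len(ncol(y)), groups)
group.means <- sapply(by.group, function(i) rowMeans(y[, i, drop=FALSE]))

# Hypothetical scaling: assume 20% of each group's total expression is ambient.
# (In removeAmbience, this contribution is estimated rather than assumed.)
assumed.fraction <- 0.2
ambient.means <- outer(ambient, colSums(group.means) * assumed.fraction)

# Subtract the ambient contribution from each average, truncating at zero.
cleaned.means <- pmax(group.means - ambient.means, 0)
```

Here group.means plays the role of the "old" averages and cleaned.means the "new" averages for each group.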

We then perform quantile-quantile mapping of counts in y from the old to new averages. This approach preserves the mean-variance relationship and improves the precision of the estimate of the ambient contribution, but relies on a sensible grouping of similar cells, e.g., unsupervised clusters or cell type annotations. As such, this function is best used at the end of the analysis to clean up expression matrices prior to visualization.
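
A minimal sketch of quantile-quantile mapping for a single gene is shown below. It assumes a negative binomial distribution with the given dispersion and cell-specific means scaled by the size factors; all input values are hypothetical, and the package's internal implementation may differ in its details.

```r
# Hypothetical values for one gene across five cells of the same group.
counts <- c(0, 3, 7, 12, 20)
size.factors <- c(0.5, 0.8, 1, 1.2, 1.5)
old.mean <- 8      # group average before ambient removal
new.mean <- 5      # group average after ambient removal
dispersion <- 0.1

# Step 1: cumulative probability of each count under the old mean.
p <- pnbinom(counts, mu=old.mean * size.factors, size=1/dispersion)

# Step 2: the count at the same quantile under the new (lower) mean.
remapped <- qnbinom(p, mu=new.mean * size.factors, size=1/dispersion)
```

Because each count is moved to the same quantile of a distribution with a lower mean, the remapped counts never exceed the originals, while the spread of the counts around the group average remains governed by the same dispersion.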

Value

A numeric matrix-like object of the same dimensions as y, containing the counts after removing the ambient contamination. The exact representation of the output will depend on the class of y and whether sink was used.

Author(s)

Aaron Lun

See Also

maximumAmbience and controlAmbience, to estimate the ambient contribution.

estimateAmbience, to estimate the ambient profile.

The SoupX package, which provides another implementation of the same general approach.

Examples

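library(DropletUtils)
set.seed(1000) # for reproducible simulated counts

# Making up some data: the first 100 genes are expressed in the cells,
# while the remaining 900 genes only receive ambient contamination.
ngenes <- 1000
ambient <- runif(ngenes, 0, 0.1)
cells <- c(runif(100) * 10, integer(900))
y <- matrix(rpois(ngenes * 100, cells + ambient), nrow=ngenes)

# Pretending that all cells are in one group, in this example.
removed <- removeAmbience(y, ambient, groups=rep(1, ncol(y)))

# Comparing the expressed genes to the ambient-only genes after removal.
summary(rowMeans(removed[1:100,]))
summary(rowMeans(removed[101:1000,]))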

DropletUtils documentation built on Feb. 4, 2021, 2:01 a.m.