removeAmbience: Remove the ambient profile

removeAmbience    R Documentation

Remove the ambient profile

Description

Estimate and remove the ambient profile from a count matrix, given pre-existing groupings of similar cells. This function is largely intended for plot beautification rather than real analysis.

Usage

removeAmbience(y, ...)

## S4 method for signature 'ANY'
removeAmbience(
  y,
  ambient,
  groups,
  features = NULL,
  ...,
  size.factors = librarySizeFactors(y),
  dispersion = 0.1,
  sink = NULL,
  BPPARAM = SerialParam()
)

## S4 method for signature 'SummarizedExperiment'
removeAmbience(y, ..., assay.type = "counts")

Arguments

y

A numeric matrix-like object containing counts for each gene (row) and cell or group of cells (column). Alternatively, a SummarizedExperiment containing such a matrix.

...

For the generic, further arguments to pass to specific methods.

For the SummarizedExperiment method, further arguments to pass to the ANY method.

For the ANY method, further arguments to pass to ambientContribMaximum.

ambient

A numeric vector of length equal to nrow(y), containing the proportions of transcripts for each gene in the ambient solution.

groups

A vector of length equal to ncol(y), specifying the assigned group for each cell. This can also be a DataFrame; see ?sumCountsAcrossCells.

features

A vector of control features or a list of mutually exclusive feature sets; see ?ambientContribNegative for more details.

size.factors

Numeric vector of length equal to ncol(y), specifying the size factor for each column of y. Defaults to library size-derived size factors.

dispersion

Numeric scalar specifying the dispersion to use in the quantile-quantile mapping.

sink

An optional RealizationSink object of the same dimensions as y.

BPPARAM

A BiocParallelParam object specifying how parallelization should be performed.

assay.type

Integer or string specifying the assay containing the count matrix.

Details

This function will aggregate counts from each group of related cells into an average profile. For each group, we estimate the contribution of the ambient profile and subtract it from the average. By default, this is done with ambientContribMaximum, but if enough is known about the biological system, users can specify features to use ambientContribNegative instead.
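
The aggregation and subtraction described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the scaling of the ambient profile (assumed.fraction) is a hypothetical placeholder for the value that ambientContribMaximum or ambientContribNegative would estimate, and all object names are made up for this sketch.

```r
set.seed(100)
ngenes <- 20
y <- matrix(rpois(ngenes * 6, lambda = 5), nrow = ngenes)
groups <- c("A", "A", "A", "B", "B", "B")
ambient <- rep(1 / ngenes, ngenes) # ambient proportions, summing to 1

# Average profile per group (genes in rows, groups in columns).
# rowsum() aggregates over rows, hence the transposes.
group.sums <- t(rowsum(t(y), groups))
ncells <- as.vector(table(groups)[colnames(group.sums)])
group.avg <- sweep(group.sums, 2, ncells, "/")

# ASSUMPTION: pretend that ambient transcripts make up 10% of each group's
# average library size. In practice this scaling is estimated, not fixed.
assumed.fraction <- 0.1
ambient.contrib <- outer(ambient, assumed.fraction * colSums(group.avg))

# Subtract the ambient contribution, never going below zero.
cleaned.avg <- pmax(group.avg - ambient.contrib, 0)
```

Here cleaned.avg plays the role of the "new" average for each group, which is then used as the target of the remapping step.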

We then perform quantile-quantile mapping of counts in y from the old to new averages. This approach preserves the mean-variance relationship and improves the precision of the estimate of the ambient contribution, but relies on a sensible grouping of similar cells, e.g., unsupervised clusters or cell type annotations. As such, this function is best used at the end of the analysis to clean up expression matrices prior to visualization.
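
As a rough illustration of the quantile-quantile mapping, the sketch below remaps the counts of a single gene using negative binomial distributions from base R. It assumes that each cell's mean is the group average scaled by that cell's size factor and that the dispersion is shared; the averages, size factors and counts are hypothetical values, and the actual function may treat the tails of the distribution differently.

```r
dispersion <- 0.1
size <- 1 / dispersion # 'size' parameter of the NB functions in stats

old.avg <- 5 # hypothetical group average for one gene, before subtraction
new.avg <- 3 # hypothetical group average after removing the ambient part
size.factors <- c(0.5, 1, 2) # hypothetical per-cell size factors
counts <- c(2, 6, 12)        # observed counts for this gene in three cells

# Per-cell means under the old and new averages.
old.mu <- old.avg * size.factors
new.mu <- new.avg * size.factors

# Find the quantile of each count under the old mean, then look up
# the count at the same quantile under the new mean.
p <- pnbinom(counts, mu = old.mu, size = size)
remapped <- qnbinom(p, mu = new.mu, size = size)
```

Because the new means are lower and the dispersion is unchanged, each remapped count is no larger than the original, while a count that was high relative to its old mean stays high relative to its new mean.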

Value

A numeric matrix-like object of the same dimensions as y, containing the counts after removing the ambient contamination. The exact representation of the output will depend on the class of y and whether sink was used.

Author(s)

Aaron Lun

See Also

ambientContribMaximum and ambientContribNegative, to estimate the ambient contribution.

estimateAmbience, to estimate the ambient profile.

The SoupX package, which provides another implementation of the same general approach.

Examples

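# Making up some data: the first 100 genes are truly expressed, while
# the remaining 900 only receive counts from the ambient solution.
library(DropletUtils)
set.seed(1000) # for reproducibility
ngenes <- 1000
ambient <- runif(ngenes, 0, 0.1)
cells <- c(runif(100) * 10, integer(900))
# Each of the 100 columns is a cell with per-gene mean 'cells + ambient'.
y <- matrix(rpois(ngenes * 100, cells + ambient), nrow=ngenes)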

# Pretending that all cells are in one group, in this example.
removed <- removeAmbience(y, ambient, groups=rep(1, ncol(y)))
summary(rowMeans(removed[1:100,]))
summary(rowMeans(removed[101:1000,]))
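
The SummarizedExperiment method and a grouping with more than one level can be exercised in the same way. The following is a hedged sketch that regenerates the simulated data so that it runs on its own; the two-group split is arbitrary and only serves to show the groups argument, and the object names (se, groups2, removed.se) are made up here.

```r
library(DropletUtils)
library(SummarizedExperiment)

set.seed(1000)
ngenes <- 1000
ambient <- runif(ngenes, 0, 0.1)
cells <- c(runif(100) * 10, integer(900))
y <- matrix(rpois(ngenes * 100, cells + ambient), nrow=ngenes)

# Wrapping the counts in a SummarizedExperiment.
se <- SummarizedExperiment(list(counts=y))

# Arbitrary split of the cells into two groups, for illustration only.
groups2 <- rep(c("A", "B"), each=50)

removed.se <- removeAmbience(se, ambient=ambient, groups=groups2,
    assay.type="counts")
dim(removed.se)
```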


MarioniLab/DropletUtils documentation built on Oct. 12, 2024, 5:40 p.m.