ambientContribNegative: Ambient contribution from negative controls

ambientContribNegative R Documentation

Ambient contribution from negative controls

Description

Estimate the contribution of the ambient solution to a particular expression profile, based on the abundance of negative control features that should not be expressed in that profile.

Usage

controlAmbience(...)

ambientContribNegative(y, ...)

## S4 method for signature 'ANY'
ambientContribNegative(
  y,
  ambient,
  features,
  mode = c("scale", "profile", "proportion")
)

## S4 method for signature 'SummarizedExperiment'
ambientContribNegative(y, ..., assay.type = "counts")

Arguments

...

For the generic, further arguments to pass to individual methods.

For the SummarizedExperiment method, further arguments to pass to the ANY method.

For controlAmbience, arguments to pass to ambientContribNegative.

y

A numeric matrix-like object containing counts, where each row represents a feature (e.g., a gene or a conjugated tag) and each column represents either a cell or group of cells.

Alternatively, a SummarizedExperiment object containing such a matrix.

y can also be a numeric vector of counts; this is coerced into a one-column matrix.

ambient

A numeric vector of length equal to nrow(y), containing the proportions of transcripts for each gene in the ambient solution. Alternatively, a matrix where each row corresponds to a row of y and each column contains a specific ambient profile for the corresponding column of y.

features

A logical, integer or character vector specifying the negative control features in y and ambient.

Alternatively, a list of vectors specifying mutually exclusive sets of features.

mode

String indicating the output to return, see Value.

assay.type

Integer or string specifying the assay containing the count matrix.

Details

Negative control features should be those that cannot be expressed in y and are thus fully attributable to ambient contamination. They are most commonly determined a priori from the biological context and experimental system. For example, if spike-ins were introduced into the solution prior to cell capture, these would serve as a gold standard for ambient contamination in y. For single-nuclei sequencing, mitochondrial transcripts can serve a similar role under the assumption that all high-quality libraries are stripped nuclei.
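As a rough illustration of the idea, the sketch below uses base R only to estimate a scaling factor as the ratio of the observed counts at the negative controls to the ambient proportions of the same features. This ratio is an assumption made for illustration and is not necessarily the computation performed by ambientContribNegative; all variable names and the simulated data are hypothetical.

```r
set.seed(100)

# Hypothetical ambient proportions for 1000 features.
ambient <- c(runif(900, 0, 0.1), runif(100))
ambient <- ambient / sum(ambient)

# Simulated counts: ambient contamination in every feature, plus real
# expression in all features except the first 100 (the negative controls).
true.scale <- 5000
y <- rpois(1000, ambient * true.scale) + c(integer(100), rpois(900, 5))

controls <- 1:100

# Assumed estimator: counts at the controls are purely ambient, so their
# total divided by the total ambient proportion approximates the scale.
est.scale <- sum(y[controls]) / sum(ambient[controls])
est.scale
```

The estimate is only as good as the assumption that the controls are truly silent; any real expression in those features would inflate it.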

If features is a list, it is expected to contain multiple sets of mutually exclusive features. Each cell should express features from at most one set, so the expression of multiple sets in the same cell can be attributed to ambient contamination. For this mode, an archetypal pairing is that of hemoglobins with immunoglobulins (Young and Behjati, 2018), which should not be co-expressed in any (known) cell type.
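A minimal sketch of how such a list might be assembled is shown below. The gene symbols and regular expressions are hypothetical placeholders chosen for illustration; a real analysis would select the sets from its own annotation.

```r
# Hypothetical feature names; a real dataset would use its own row names.
genes <- c("HBB", "HBA1", "HBA2", "IGKC", "IGHG1", "IGHM", "ACTB", "GAPDH")

# Two sets assumed to be mutually exclusive within any single cell.
sets <- list(
    hemoglobin = grep("^HB[AB]", genes, value=TRUE),
    immunoglobulin = grep("^IG[HKL]", genes, value=TRUE)
)

# The sets should not share any features.
stopifnot(length(intersect(sets$hemoglobin, sets$immunoglobulin)) == 0)
sets

# With DropletUtils loaded, and 'y' and 'ambient' named by these features,
# the list would be supplied as:
#   ambientContribNegative(y, ambient, features=sets)
```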

controlAmbience is soft-deprecated; use ambientContribNegative instead.

Value

If mode="scale", a numeric vector is returned quantifying the estimated “contribution” of the ambient solution to each column of y. Scaling ambient by each entry yields the maximum ambient profile for the corresponding column of y.

If mode="profile", a numeric matrix is returned containing the estimated ambient profile for each column of y. This is computed by scaling as described above; if ambient is a matrix, each column is scaled by the corresponding entry of the scaling vector.

If mode="proportion", a numeric matrix is returned containing the estimated proportion of counts in y that are attributable to ambient contamination. This is computed by simply dividing the output of mode="profile" by y and capping all values at 1.
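The relationship between the three outputs can be sketched in base R as follows. The inputs here are hypothetical, with scaling standing in for a vector such as that returned by mode="scale"; this mirrors the description above rather than reproducing the package's code.

```r
# Hypothetical inputs: counts for 5 features in 2 samples, an ambient
# profile, and one scaling factor per column of 'y'.
y <- cbind(A=c(10, 0, 5, 20, 2), B=c(4, 8, 0, 1, 30))
ambient <- c(0.1, 0.2, 0.3, 0.3, 0.1)
scaling <- c(A=20, B=10)

# Profile: the ambient vector scaled by each column's entry.
profile <- outer(ambient, scaling)

# Proportion: profile divided by the counts, capped at 1.
# A zero count with a positive profile gives Inf, which is capped to 1.
proportion <- pmin(profile / y, 1)
proportion
```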

Author(s)

Aaron Lun

References

Young MD and Behjati S (2018). SoupX removes ambient RNA contamination from droplet based single-cell RNA sequencing data. bioRxiv.

See Also

ambientProfileEmpty or ambientProfileBimodal, to obtain a profile estimate to use in ambient.

ambientContribMaximum or ambientContribSparse, for other methods of estimating contribution when negative control features are not available.

Examples

library(DropletUtils)

# Making up some data.
ambient <- c(runif(900, 0, 0.1), runif(100))
y <- rpois(1000, ambient * 50)
y <- y + c(integer(100), rpois(900, 5)) # actual biology, but first 100 genes silent.

# Using the first 100 genes as negative controls:
scaling <- ambientContribNegative(y, ambient, features=1:100)
scaling

# Estimating the negative control contribution to 'y' by 'ambient'.
contribution <- ambientContribNegative(y, ambient, features=1:100, mode="profile")
DataFrame(ambient=drop(contribution), total=y)


MarioniLab/DropletUtils documentation built on March 14, 2024, 11:04 p.m.