perCellQCFilters: Compute filters for low-quality cells
In LTLA/scuttle: Single-Cell RNA-Seq Analysis Utilities

perCellQCFilters

R Documentation

Description

Identifies low-quality cells as outliers for frequently used QC metrics.

Usage

perCellQCFilters(
  x,
  sum.field = "sum",
  detected.field = "detected",
  sub.fields = NULL,
  ...
)

Arguments

`x`	A DataFrame containing per-cell QC statistics, as computed by `perCellQCMetrics`.
`sum.field`	String specifying the column of `x` containing the library size for each cell.
`detected.field`	String specifying the column of `x` containing the number of detected features per cell.
`sub.fields`	Character vector specifying the column(s) of `x` containing the percentage of counts in subsets of “control features”, usually mitochondrial genes or spike-in transcripts. If set to `TRUE`, this will default to all columns in `x` with names following the patterns `"subsets_.*_percent"` and `"altexps_.*_percent"`.
`...`	Further arguments to pass to `isOutlier`.
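As an illustration of which columns `sub.fields=TRUE` would pick up, the following base R sketch matches the two patterns against a set of column names. The column names are hypothetical (written in the style of `perCellQCMetrics` output), and the sketch assumes the patterns are treated as regular expressions; it does not reproduce the package's internal matching code.

```r
# Hypothetical column names in the style of perCellQCMetrics() output.
qc.columns <- c("sum", "detected",
    "subsets_Mito_sum", "subsets_Mito_detected", "subsets_Mito_percent",
    "altexps_Spikes_sum", "altexps_Spikes_percent", "total")

# Assumption: the documented patterns are regular expressions.
patterns <- c("subsets_.*_percent", "altexps_.*_percent")

# Keep the column names that match either pattern.
matched <- qc.columns[grepl(paste(patterns, collapse = "|"), qc.columns)]
matched
```

Only the percentage columns are selected; the `_sum` and `_detected` columns for the same subsets are left out.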

Details

This function simply calls `isOutlier` on the various QC metrics in `x`.

For `sum.field`, small outliers are detected. These are considered to represent low-quality cells that have not been sufficiently sequenced. Detection is performed on the log-scale to adjust for a heavy right tail and to improve resolution at zero.
For `detected.field`, small outliers are detected. These are considered to represent low-quality cells with low-complexity libraries. Detection is performed on the log-scale to adjust for a heavy right tail and to improve resolution at zero.
For each column specified by `sub.fields`, large outliers are detected. This aims to remove cells with high spike-in or mitochondrial content, usually corresponding to damaged cells. While these distributions often have heavy right tails, the putative low-quality cells are often present in this tail; thus, no transformation is performed, to maintain the resolution of the filter.
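The three rules above can be sketched in base R. This is a simplified illustration of MAD-based outlier calls, not the `isOutlier` implementation: the helper `flag_outliers`, the choice of `log2`, and all of the data values are assumptions made for the example.

```r
# Made-up QC metrics for eight cells; the last cell is deliberately poor.
lib.size <- c(52000, 48000, 61000, 55000, 47000, 58000, 50000, 300)
mito.pct <- c(2.1, 3.0, 2.5, 1.8, 2.9, 2.2, 2.7, 45.0)

# Hypothetical helper: flag values more than 'nmads' MADs below ("lower")
# or above ("higher") the median, optionally after a log-transformation.
flag_outliers <- function(values, nmads = 3, type = c("lower", "higher"), log = FALSE) {
    type <- match.arg(type)
    if (log) values <- log2(values)
    centre <- median(values)
    spread <- mad(values)  # stats::mad applies the 1.4826 scaling constant by default
    if (type == "lower") {
        values < centre - nmads * spread
    } else {
        values > centre + nmads * spread
    }
}

# Small outliers on the log-scale for library size.
low.lib <- flag_outliers(lib.size, type = "lower", log = TRUE)

# Large outliers on the untransformed scale for the mitochondrial percentage.
high.mito <- flag_outliers(mito.pct, type = "higher")

# A cell is discarded if it fails any individual filter.
discard <- low.lib | high.mito
```

With these values, only the eighth cell falls outside the thresholds, for both metrics.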

Users can control the outlier detection (e.g., change the number of MADs, specify batches) by passing appropriate arguments to `...`.
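For instance, the `nmads` and `batch` arguments of `isOutlier` can be supplied through `...`. The sketch below uses the mock dataset from the Examples; the batch labels are hypothetical and assigned arbitrarily, purely to show the call.

```r
library(scuttle)

set.seed(1000)
example_sce <- mockSCE()
stats <- perCellQCMetrics(example_sce, subsets = list(Mito = 1:100))

# Default settings of isOutlier().
default <- perCellQCFilters(stats, sub.fields = "subsets_Mito_percent")

# Stricter threshold: flag cells more than 2 MADs from the median.
strict <- perCellQCFilters(stats, sub.fields = "subsets_Mito_percent", nmads = 2)

# Hypothetical batch labels: outliers are then identified within each batch.
batch <- rep(c("A", "B"), length.out = ncol(example_sce))
batched <- perCellQCFilters(stats, sub.fields = "subsets_Mito_percent", batch = batch)

# Compare the number of discarded cells under each setting.
c(default = sum(default$discard), strict = sum(strict$discard),
    batched = sum(batched$discard))
```

Lowering `nmads` tightens every threshold, so the stricter call discards at least as many cells as the default one.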

Value

A DataFrame with one row per cell, containing logical columns. Each column specifies a reason why a cell was considered to be low quality, with the final `discard` column indicating whether the cell should be discarded.
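A common use of the returned DataFrame is to subset the original object by the `discard` column. The following is a minimal sketch on the mock dataset; the variable names are illustrative only.

```r
library(scuttle)

set.seed(1000)
example_sce <- mockSCE()
stats <- perCellQCMetrics(example_sce, subsets = list(Mito = 1:100))
filters <- perCellQCFilters(stats, sub.fields = "subsets_Mito_percent")

# Number of cells flagged for each reason, and overall in 'discard'.
colSums(as.data.frame(filters))

# Keep only the cells that were not flagged for any reason.
filtered_sce <- example_sce[, !filters$discard]
ncol(filtered_sce)
```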

Author(s)

Aaron Lun

Examples

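library(scuttle)
# Mock dataset; the first 100 features are arbitrarily treated as "Mito".
example_sce <- mockSCE()
x <- perCellQCMetrics(example_sce, subsets=list(Mito=1:100))

# Flag outliers for library size, detected features and both percentages.
discarded <- perCellQCFilters(x,
    sub.fields=c("subsets_Mito_percent", "altexps_Spikes_percent"))

# Number of cells flagged for each reason.
colSums(as.data.frame(discarded))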

LTLA/scuttle documentation built on Oct. 28, 2024, 9:45 a.m.