gene_selection: Gene selection

Description

Details on how gene selection is performed in almost all scran functions.

Subsetting by row

For functions accepting a gene-by-cell matrix x, we can choose to perform calculations only on a subset of rows (i.e., genes) with the subset.row argument. This can be a logical, integer or character vector indicating the rows of x to use. A character vector must contain names of the rows in x. Support for more esoteric subsetting vectors, such as the Bioconductor Rle classes, may be added in the future.
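The three accepted vector types can be illustrated with ordinary row indexing in base R. This is only a sketch on a toy matrix; the gene and cell names are made up for illustration, and no scran function is called.

```r
# Toy gene-by-cell matrix with named rows (genes); the values are arbitrary.
x <- matrix(1:12, nrow = 4,
    dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Three equivalent ways of describing the same subset of rows.
by.logical   <- c(TRUE, FALSE, TRUE, FALSE)  # one entry per row of x
by.integer   <- c(1L, 3L)                    # row positions
by.character <- c("gene1", "gene3")          # row names, must exist in x

# drop = FALSE keeps the result as a matrix even for a single row.
sub.logical   <- x[by.logical, , drop = FALSE]
sub.integer   <- x[by.integer, , drop = FALSE]
sub.character <- x[by.character, , drop = FALSE]
```

Any of these vectors would describe the same two genes when supplied as subset.row.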

The output of running a function with subset.row will always be the same as the output of subsetting x beforehand and passing the subsetted matrix to the function. However, it is often more efficient to use subset.row, as this avoids constructing an intermediate subsetted matrix. The same reasoning applies for any x that is a SingleCellExperiment object.
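The equivalence described above can be sketched with a small stand-in function. Here row_variances is a hypothetical helper written for this example, not a scran function; it only mimics the subset.row convention by working on the selected rows directly rather than copying them into a new matrix first.

```r
# Hypothetical helper (not part of scran): per-gene variance with a
# subset.row argument that accepts logical, integer or character vectors.
row_variances <- function(x, subset.row = NULL) {
    if (is.null(subset.row)) {
        subset.row <- seq_len(nrow(x))
    } else if (is.character(subset.row)) {
        subset.row <- match(subset.row, rownames(x))
    } else if (is.logical(subset.row)) {
        subset.row <- which(subset.row)
    }
    # Visit only the requested rows; no subsetted copy of x is built.
    out <- vapply(subset.row, function(i) var(x[i, ]), numeric(1))
    names(out) <- rownames(x)[subset.row]
    out
}

set.seed(100)
x <- matrix(rpois(50, lambda = 5), nrow = 10,
    dimnames = list(paste0("gene", 1:10), NULL))
chosen <- c("gene2", "gene5", "gene9")

via.arg       <- row_variances(x, subset.row = chosen)
via.presubset <- row_variances(x[chosen, , drop = FALSE])
```

Both calls return the same named vector; the first simply skips the intermediate matrix.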

Filtering by mean

Some functions have a min.mean argument to filter out low-abundance genes prior to processing. Depending on the function, the filter may be applied to the average library size-adjusted count computed by calculateAverage, the average log-count, or some other measure of abundance; see the documentation for each function for details.

Any filtering on min.mean is automatically intersected with a specified subset.row. That is, if subset.row is specified alongside min.mean, only genes that are both in the subset and pass the abundance filter are retained.
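The intersection of the two selections can be sketched as follows. The select_genes function is hypothetical and exists only for this example. It uses the plain row mean as a stand-in abundance measure and a greater-than-or-equal comparison; both are assumptions, as the actual measure and threshold behaviour depend on the function in question.

```r
# Hypothetical helper (not part of scran): names of the genes retained
# when subset.row and min.mean are combined.
select_genes <- function(x, subset.row = NULL, min.mean = NULL) {
    keep <- if (is.null(subset.row)) seq_len(nrow(x)) else subset.row
    if (is.character(keep)) keep <- match(keep, rownames(x))
    if (is.logical(keep)) keep <- which(keep)
    if (!is.null(min.mean)) {
        # Assumption: plain row mean as the abundance measure.
        abundance <- unname(rowMeans(x))
        keep <- keep[abundance[keep] >= min.mean]
    }
    rownames(x)[keep]
}

x <- rbind(
    gene1 = c(0, 0, 1),   # low abundance, in the subset
    gene2 = c(5, 6, 7),   # high abundance, in the subset
    gene3 = c(0, 1, 0),   # low abundance, in the subset
    gene4 = c(9, 8, 10)   # high abundance, not in the subset
)

selected <- select_genes(x,
    subset.row = c("gene1", "gene2", "gene3"), min.mean = 1)
```

Here only gene2 is retained: gene1 and gene3 fail the abundance filter, while gene4 passes it but was never in the subset.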

Author(s)

Aaron Lun


scran documentation built on April 17, 2021, 6:09 p.m.