librarySizeFactors: Compute library size factors

Description

Define per-cell size factors from the library sizes (i.e., total sum of counts per cell).

Usage

librarySizeFactors(x, ...)

## S4 method for signature 'ANY'
librarySizeFactors(
  x,
  subset.row = NULL,
  geometric = FALSE,
  BPPARAM = SerialParam(),
  subset_row = NULL,
  pseudo_count = 1
)

## S4 method for signature 'SummarizedExperiment'
librarySizeFactors(x, ..., assay.type = "counts", exprs_values = NULL)

computeLibraryFactors(x, ...)

Arguments

x

For librarySizeFactors, a numeric matrix of counts with one row per feature and one column per cell. Alternatively, a SummarizedExperiment or SingleCellExperiment containing such counts.

For computeLibraryFactors, only a SingleCellExperiment containing a count matrix is accepted.

...

For the librarySizeFactors generic, arguments to pass to specific methods. For the SummarizedExperiment method, further arguments to pass to the ANY method.

For computeLibraryFactors, further arguments to pass to librarySizeFactors.

subset.row

A vector specifying the subset of rows of x from which the size factors should be computed.

geometric

Deprecated; a logical scalar indicating whether the size factors should be defined using the geometric mean.

BPPARAM

A BiocParallelParam object indicating how calculations are to be parallelized. Only relevant when x is a DelayedArray object.

subset_row, exprs_values

Soft-deprecated equivalents of subset.row and assay.type, respectively.

pseudo_count

Deprecated; a numeric scalar specifying the pseudo-count to add when geometric=TRUE.

assay.type

String or integer scalar indicating the assay of x containing the counts.

Details

Library sizes are converted into size factors by scaling them so that their mean across cells is unity. This ensures that the normalized values are still on the same scale as the raw counts. Preserving the scale aids interpretation of operations on the normalized values; for example, the pseudo-count used in logNormCounts can be considered an additional read or UMI. This is important for ensuring that the effect of the pseudo-count decreases with increasing sequencing depth; see ?normalizeCounts for a discussion of this effect.
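
The scaling described above can be sketched in base R. This is a minimal illustration of the arithmetic only, not the package's internal code; the matrix counts and the names lib.sizes, size.factors and normalized are made up for the example.

```r
# Hypothetical toy count matrix: 3 features (rows) by 4 cells (columns).
counts <- matrix(
    c(10, 0, 5,
      20, 4, 6,
       5, 1, 9,
      40, 8, 12),
    nrow = 3,
    dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Library size: total count per cell.
lib.sizes <- colSums(counts)

# Scale the library sizes so that their mean across cells is 1.
size.factors <- lib.sizes / mean(lib.sizes)

# Dividing each column by its size factor keeps the values on the
# same scale as the raw counts.
normalized <- sweep(counts, 2, size.factors, "/")
```

After this scaling, every column of normalized sums to the mean library size, so a pseudo-count of 1 added later can still be read as roughly one extra read or UMI.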

With library size-derived size factors, we implicitly assume that sequencing coverage is the only difference between cells. This is reasonable for homogeneous cell populations but is compromised by composition biases arising from differential expression (DE) between cell types. In such cases, the library size factors will not be correct, though the effect on downstream conclusions will vary: clustering is usually unaffected by composition biases, but log-fold change estimates will be less accurate.
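
A small base R sketch may help to show how composition bias arises. The two-cell matrix below is entirely hypothetical: both cells have identical counts for four non-DE genes, and a fifth gene is strongly upregulated in cell B only.

```r
# Hypothetical counts: genes 1-4 are not DE, gene 5 is upregulated in B.
counts <- cbind(
    A = c(10, 10, 10, 10, 10),
    B = c(10, 10, 10, 10, 110)
)
rownames(counts) <- paste0("gene", 1:5)

# Library sizes are 50 for A and 150 for B, driven entirely by gene 5.
lib.sizes <- colSums(counts)
size.factors <- lib.sizes / mean(lib.sizes)

normalized <- sweep(counts, 2, size.factors, "/")

# The non-DE genes now appear 3-fold lower in B than in A, even though
# their raw counts were identical. This is the composition bias.
log2(normalized["gene1", "B"] / normalized["gene1", "A"])
```

The spurious log-fold change for the non-DE genes is the kind of inaccuracy referred to above; a coarse analysis such as clustering would often still separate the two cells correctly because gene 5 remains the dominant difference.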

Value

For librarySizeFactors, a numeric vector of size factors is returned for all methods.

For computeLibraryFactors, x is returned containing the size factors in sizeFactors(x).

Author(s)

Aaron Lun

See Also

normalizeCounts and logNormCounts, where these size factors are used by default.

geometricSizeFactors and medianSizeFactors, for two other simple methods of computing size factors.

Examples

library(scuttle)
# Simulate a small SingleCellExperiment to work with.
example_sce <- mockSCE()
summary(librarySizeFactors(example_sce))
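
The following sketch extends the example to the subset.row argument and to computeLibraryFactors, both listed in the usage above. The choice of the first 100 rows is arbitrary and only for illustration, and the variable names sce and sf.subset are not part of the original example.

```r
library(scuttle)

sce <- mockSCE()

# Compute size factors from a subset of rows only
# (here the first 100 features, an arbitrary choice).
sf.subset <- librarySizeFactors(sce, subset.row = 1:100)
summary(sf.subset)

# Store the size factors in the object and retrieve them afterwards.
sce <- computeLibraryFactors(sce)
summary(sizeFactors(sce))
```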

scuttle documentation built on Dec. 19, 2020, 2 a.m.