normalize: Normalize a SingleCellExperiment object using pre-computed...


Description

Compute normalized expression values from count data in a SingleCellExperiment object, using the size factors stored in the object. This function is deprecated; use logNormCounts instead.

Usage

normalizeSCE(
  object,
  exprs_values = "counts",
  return_log = TRUE,
  log_exprs_offset = NULL,
  centre_size_factors = TRUE,
  preserve_zeroes = FALSE
)

## S4 method for signature 'SingleCellExperiment'
normalize(
  object,
  exprs_values = "counts",
  return_log = TRUE,
  log_exprs_offset = NULL,
  centre_size_factors = TRUE,
  preserve_zeroes = FALSE
)

Arguments

object

A SingleCellExperiment object.

exprs_values

String indicating which assay contains the count data that should be used to compute log-transformed expression values.

return_log

Logical scalar indicating whether normalized values should be returned on the log2 scale. If TRUE, output is stored as "logcounts" in the returned object; if FALSE, output is stored as "normcounts".

log_exprs_offset

Numeric scalar specifying the pseudo-count to add when log-transforming expression values. If NULL, the value is taken from metadata(object)$log.exprs.offset if defined, otherwise it is set to 1.

centre_size_factors

Logical scalar indicating whether size factors should be centred.

preserve_zeroes

Logical scalar indicating whether zeroes should be preserved when dealing with non-unity offsets.

Details

Normalized expression values are computed by dividing the counts for each cell by the size factor for that cell. This aims to remove cell-specific scaling biases, e.g., due to differences in sequencing coverage or capture efficiency. If return_log=TRUE, log-normalized values are calculated by adding log_exprs_offset to the normalized count and performing a log2 transformation.
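The calculation described above can be sketched in base R on a plain matrix. This is an illustration only, not the package's implementation: the toy counts and the library-size-based size factors are assumptions made for the example, and no SingleCellExperiment object is involved.

```r
# Toy count matrix: 3 features (rows) x 4 cells (columns).
counts <- matrix(
  c(10, 0,  4,
    20, 2,  8,
     5, 0,  2,
    40, 4, 16),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Hypothetical per-cell size factors: library sizes scaled to a mean of 1.
lib_sizes <- colSums(counts)
size_factors <- lib_sizes / mean(lib_sizes)

# Divide the counts in each column (cell) by that cell's size factor.
norm_counts <- sweep(counts, 2, size_factors, "/")

# Add the pseudo-count and log2-transform (the default offset of 1 is assumed).
log_exprs_offset <- 1
log_counts <- log2(norm_counts + log_exprs_offset)
```

With a pseudo-count of 1, zero counts map to zero in log_counts, because log2(0 + 1) = 0.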

Features marked as spike-in controls will be normalized with control-specific size factors, if these are available. This reflects the fact that spike-in controls are subject to different biases than those that are removed by gene-specific size factors (namely, total RNA content). If size factors for a particular spike-in set are not available, a warning will be raised.
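As a rough illustration of feature-specific size factors, the base R sketch below divides endogenous gene rows and spike-in rows by different size factor vectors. The counts, the is_spike flag, and both size factor vectors are hypothetical values chosen for the example; the function itself reads this information from the SingleCellExperiment object.

```r
# Toy counts: two endogenous genes and one spike-in feature across 3 cells.
counts <- matrix(
  c( 5, 2,  90,
    10, 4, 100,
    15, 6, 110),
  nrow = 3,
  dimnames = list(c("gene1", "gene2", "spike1"), paste0("cell", 1:3))
)

# Hypothetical flag marking which rows are spike-in controls.
is_spike <- c(FALSE, FALSE, TRUE)

# Hypothetical size factors, one vector per feature set (each has mean 1).
sf_gene  <- c(0.5, 1.0, 1.5)
sf_spike <- c(0.9, 1.0, 1.1)

# Normalize each feature set with its own size factors.
norm_counts <- counts
norm_counts[!is_spike, ] <- sweep(counts[!is_spike, , drop = FALSE], 2, sf_gene, "/")
norm_counts[is_spike, ]  <- sweep(counts[is_spike, , drop = FALSE], 2, sf_spike, "/")
```

In this toy data each row becomes constant across cells after normalization, because the counts were constructed to be proportional to the matching size factors.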

If centre_size_factors=TRUE, all sets of size factors will be centred to have the same mean prior to calculation of normalized expression values. This ensures that abundances are roughly comparable between features normalized with different sets of size factors. By default, size factors are centred at unity, which means that the computed expression values can be interpreted as being on the same scale as log-counts. It also means that the added log_exprs_offset can be interpreted as a pseudo-count (i.e., on the same scale as the counts).
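Centring can be pictured as rescaling each set of size factors by its own mean. The helper centre_sf below is a hypothetical name introduced for this sketch, and the two size factor vectors are made-up values; neither comes from the package.

```r
# Hypothetical, uncentred size factors for two feature sets across 4 cells.
sf_gene  <- c(0.5, 1.5, 2.0, 4.0)  # e.g. for endogenous genes
sf_spike <- c(10, 12, 8, 10)       # e.g. for a spike-in set

# Rescale a vector of size factors so that its mean equals `centre`.
centre_sf <- function(sf, centre = 1) sf / mean(sf) * centre

sf_gene_c  <- centre_sf(sf_gene)
sf_spike_c <- centre_sf(sf_spike)
```

Both centred vectors now have a mean of 1, while the ratios between cells within each set are unchanged.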

If preserve_zeroes=TRUE and the pseudo-count is not unity, size factors are instead centred at the specified value of log_exprs_offset. The log-transformation is then performed on the normalized expression values with a pseudo-count of 1, which ensures that zeroes remain so in the output matrix. This yields the same results as preserve_zeroes=FALSE minus a matrix-wide constant of log2(log_exprs_offset).
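The stated equivalence can be checked numerically in base R. This sketch assumes a single set of size factors that already has a mean of 1 and an offset of 0.5; all values are made up for illustration.

```r
# Toy counts (2 features x 3 cells) containing some zeroes.
counts <- matrix(c(0, 3, 8, 0, 12, 5), nrow = 2)
sf <- c(0.5, 1.0, 1.5)  # hypothetical size factors with mean 1
offset <- 0.5           # non-unity pseudo-count

# Behaviour described for preserve_zeroes = FALSE:
# size factors centred at 1, pseudo-count equal to the offset.
standard <- log2(sweep(counts, 2, sf, "/") + offset)

# Behaviour described for preserve_zeroes = TRUE:
# size factors centred at the offset, pseudo-count of 1.
sf_rescaled <- sf / mean(sf) * offset
preserved <- log2(sweep(counts, 2, sf_rescaled, "/") + 1)

# The two results differ by the constant log2(offset) everywhere.
difference <- standard - preserved
```

Here every entry of difference equals log2(0.5) = -1, and the zero counts remain zero in preserved.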

In some cases, the function will return a DelayedMatrix with delayed division and log-transformation operations. This requires that the assay specified by exprs_values contains a DelayedMatrix, and only one set of size factors is used for all features. This avoids the need to explicitly calculate normalized expression values across a very large (possibly file-backed) matrix.

Value

A SingleCellExperiment object containing normalized expression values in "normcounts" if return_log=FALSE, and log-normalized expression values in "logcounts" if return_log=TRUE. All size factors will also be centred in the output object if centre_size_factors=TRUE.

Author(s)

Davis McCarthy and Aaron Lun

Examples

example_sce <- mockSCE()
example_sce <- normalize(example_sce)

scater documentation built on Dec. 18, 2019, 2:05 a.m.