normFactors: Scaling normalization across libraries

normFactors    R Documentation

Scaling normalization across libraries

Description

Calculate normalization factors using count data from multiple libraries.

Usage

normFactors(
  object,
  method = NULL,
  weighted = FALSE,
  ...,
  assay.id = "counts",
  se.out = TRUE
)

Arguments

object

A SummarizedExperiment object containing a count matrix and library sizes in the totals field of the colData.

Alternatively, a DGEList object containing a count matrix in object$counts and library sizes in object$samples$lib.size.

Alternatively, an ordinary matrix containing counts.

method

Deprecated argument, ignored.

weighted

A logical scalar indicating whether precision weights should be used for TMM normalization.

...

Other arguments to be passed to calcNormFactors.

assay.id

An integer scalar or string specifying the assay values to use for normalization.

se.out

A logical scalar indicating whether or not a SummarizedExperiment object should be returned.

Alternatively, a SummarizedExperiment or DGEList object in which normalization factors are to be stored.
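
As a rough illustration of the arguments above, the sketch below selects the assay by name and by position, and forwards logratioTrim (an argument of edgeR's calcNormFactors) through the ... argument. The simulated counts, the variable names and the value 0.2 are hypothetical and chosen only for demonstration.

```r
library(csaw)
library(SummarizedExperiment)

set.seed(10)
# Hypothetical counts for 100 regions across 4 libraries.
counts <- matrix(rnbinom(400, mu=10, size=20), ncol=4)
se <- SummarizedExperiment(list(counts=counts))
se$totals <- colSums(counts)  # library sizes

# Select the assay by name (the default) or by position.
by.name <- normFactors(se, assay.id="counts", se.out=FALSE)
by.index <- normFactors(se, assay.id=1, se.out=FALSE)

# Forward a trimming setting to calcNormFactors via '...'.
# logratioTrim=0.2 is an arbitrary value used for illustration.
trimmed <- normFactors(se, logratioTrim=0.2, se.out=FALSE)
```

Here by.name and by.index should agree, because the object holds a single assay named "counts".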

Details

This function uses the trimmed mean of M-values (TMM) method to remove composition biases, typically in background regions of the genome. The key difference from standard TMM is that precision weighting is turned off by default so as to avoid upweighting high-abundance regions. Such regions are more likely to be bound and thus more likely to be differentially bound. Assigning excessive weight to them would defeat the purpose of trimming when normalizing the coverage of background regions.
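
The effect of the weighted argument can be explored with a small simulation. The sketch below is an assumption-laden toy example: it mixes hypothetical background bins with a smaller set of high-abundance bins that differ between two pairs of libraries, then computes factors with precision weighting off (the default) and on. All object names and simulation settings are illustrative.

```r
library(csaw)
library(SummarizedExperiment)

set.seed(100)
# 900 hypothetical background bins with similar coverage in all 4 libraries.
background <- matrix(rnbinom(900 * 4, mu=10, size=20), ncol=4)

# 100 hypothetical high-abundance bins, stronger in libraries 1-2 than in 3-4.
bound <- cbind(
    matrix(rnbinom(100 * 2, mu=200, size=20), ncol=2),
    matrix(rnbinom(100 * 2, mu=50, size=20), ncol=2)
)

bin.counts <- rbind(background, bound)
bins <- SummarizedExperiment(list(counts=bin.counts))
bins$totals <- colSums(bin.counts)

# Default behaviour: precision weighting is off.
unweighted <- normFactors(bins, se.out=FALSE)

# Precision weighting on, as in standard TMM.
weighted <- normFactors(bins, weighted=TRUE, se.out=FALSE)

# Compare the two sets of factors side by side.
cbind(unweighted, weighted)
```

How far the two columns differ depends on the simulated data, so no particular values are implied here.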

The normalization factors are always computed from object. However, if se.out is a (different) SummarizedExperiment object, these factors are stored in se.out, and the modified se.out is returned. This is useful when se.out contains counts for windows, but the normalization factors are computed using larger bins in object. The same logic applies when se.out is a (different) DGEList object.
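
A minimal sketch of this bin-to-window workflow is shown below. It uses simulated matrices in place of real output from windowCounts, and the names bins and windows are hypothetical. Both objects are given the same totals so that the library sizes are consistent.

```r
library(csaw)
library(SummarizedExperiment)

set.seed(200)
# Hypothetical large-bin counts used to estimate composition biases.
bin.counts <- matrix(rnbinom(400, mu=100, size=20), ncol=4)
bins <- SummarizedExperiment(list(counts=bin.counts))

# Hypothetical window-level counts for the same four libraries.
win.counts <- matrix(rnbinom(2000, mu=10, size=20), ncol=4)
windows <- SummarizedExperiment(list(counts=win.counts))

# Both objects carry identical library sizes in 'totals'.
lib.sizes <- colSums(bin.counts)
bins$totals <- lib.sizes
windows$totals <- lib.sizes

# Factors are computed from 'bins' but stored in (and returned with) 'windows'.
windows <- normFactors(bins, se.out=windows)
windows$norm.factors
```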

Note that an error is raised if the library sizes in se.out are not identical to object$totals. This is because the normalization factors are only comparable when the library sizes are the same. Consistent library sizes can be achieved by using the same readParam object in windowCounts and related functions.
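
The check described above can be observed with a deliberately inconsistent pair of objects. In this sketch (simulated data, hypothetical names), the totals of the second object are derived from its own counts, so they differ from those of the first. The error is captured with tryCatch so that the script does not halt; its exact wording is not assumed.

```r
library(csaw)
library(SummarizedExperiment)

set.seed(400)
bin.counts <- matrix(rnbinom(400, mu=100, size=20), ncol=4)
bins <- SummarizedExperiment(list(counts=bin.counts))
bins$totals <- colSums(bin.counts)

win.counts <- matrix(rnbinom(2000, mu=10, size=20), ncol=4)
windows <- SummarizedExperiment(list(counts=win.counts))
# Deliberately inconsistent: totals taken from the window counts instead.
windows$totals <- colSums(win.counts)

# Capture the error message; NA is returned only if no error occurs.
msg <- tryCatch(
    {
        normFactors(bins, se.out=windows)
        NA_character_
    },
    error=function(e) conditionMessage(e)
)
msg
```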

Value

If se.out=FALSE, a numeric vector containing the relative normalization factors for each library.

If se.out=TRUE, the same vector is stored in the norm.factors field of colData(object) (if object is a SummarizedExperiment) or object$samples (if object is a DGEList) and the modified object is returned.

If se.out is a separate SummarizedExperiment or DGEList object, the normalization factors are stored inside se.out and the modified object is returned.
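
The return modes described above can be compared directly. The following sketch uses simulated counts and hypothetical variable names, and builds the DGEList with edgeR's DGEList constructor so that its default library sizes (column sums) match the totals of the SummarizedExperiment.

```r
library(csaw)
library(SummarizedExperiment)
library(edgeR)

set.seed(300)
counts <- matrix(rnbinom(400, mu=10, size=20), ncol=4)

se <- SummarizedExperiment(list(counts=counts))
se$totals <- colSums(counts)

# se.out=FALSE: a plain numeric vector with one factor per library.
nf <- normFactors(se, se.out=FALSE)

# se.out=TRUE (the default): the modified SummarizedExperiment is returned.
se2 <- normFactors(se)
se2$norm.factors

# DGEList input: factors are stored in the 'samples' data frame.
y <- DGEList(counts)
y2 <- normFactors(y)
y2$samples$norm.factors
```

Because all three calls use the same counts and library sizes, the factors are expected to match numerically across the three outputs.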

Author(s)

Aaron Lun

References

Robinson MD, Oshlack A (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology 11, R25.

See Also

calcNormFactors, for the base method.

normOffsets, for the trended normalization strategy.

Examples

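library(csaw)
library(SummarizedExperiment)

# Simulate counts for 100 regions across 4 libraries.
counts <- matrix(rnbinom(400, mu=10, size=20), ncol=4)
data <- SummarizedExperiment(list(counts=counts))
data$totals <- colSums(counts) # library sizes

# TMM normalization; factors are stored in data$norm.factors of the result.
normFactors(data)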


LTLA/csaw documentation built on Dec. 21, 2024, 1:10 a.m.