Normalization of NGS data

Description

Normalizes quantitative NGS data so that counts are comparable across samples. Each sample's reads are scaled such that the coverage is even across all samples after normalization.
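The idea of scaling each sample to a common coverage can be sketched in base R. This is only an illustration of mean-based scaling on a made-up matrix, not the package's actual implementation; the names `coverage`, `scale_factor` and `X_norm` are hypothetical.

```r
# Toy read-count matrix (invented values): rows are genomic regions,
# columns are samples.
X <- matrix(c(10, 20, 30, 40,
              20, 40, 60, 80,
               5, 10, 15, 20),
            nrow = 4,
            dimnames = list(paste0("region", 1:4), c("S1", "S2", "S3")))

# Per-sample coverage summary, here the mean count per region.
coverage <- colMeans(X)

# Factor that brings every sample to the average coverage across samples.
scale_factor <- mean(coverage) / coverage

# Multiply each column (sample) by its own factor.
X_norm <- sweep(X, 2, scale_factor, `*`)

# After scaling, all samples share the same mean coverage.
colMeans(X_norm)
```

The relative differences between regions within a sample are preserved; only the per-sample scale changes.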

Usage

normalizeGenome(X, normType = "poisson", sizeFactor = "mean", qu = 0.25,
  quSizeFactor = 0.75, ploidy)

Arguments

X

Matrix of positive real values, where columns are interpreted as samples and rows as genomic regions. An entry is the read count of a sample in the genomic region. Alternatively this can be a GRanges object containing the read counts as values.

normType

Type of normalization technique. Each sample's read counts are scaled such that the total number of reads is comparable across samples. If this parameter is set to "mode", the read counts are scaled such that each sample's most frequent value (the mode) is equal after normalization. The other options work accordingly. Possible values are "mean", "median", "poisson", "quant", and "mode". Default = "poisson".

sizeFactor

This parameter determines how the size factors are calculated. Possible choices are the mean, median, or mode coverage ("mean", "median", "mode") or any quantile ("quant"). Default = "mean".

qu

Quantile used for normalization if normType is set to "quant". Real value between 0 and 1. Default = 0.25.

quSizeFactor

Quantile of the sizeFactor if sizeFactor is set to "quant". 0.75 corresponds to "upper quartile normalization". Real value between 0 and 1. Default = 0.75.
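As a rough illustration of what an upper-quartile size factor means, the following base R sketch scales simulated counts so that every sample has the same 0.75 quantile. It assumes simple per-sample rescaling and is not taken from the package source; `uq` and `X_uq` are hypothetical names.

```r
# Simulated counts: 10 regions (rows) x 4 samples (columns) with
# different sequencing depths per sample.
set.seed(1)
X <- matrix(rpois(40, lambda = rep(c(50, 100, 25, 75), each = 10)),
            nrow = 10)

# Per-sample upper quartile (the 0.75 quantile of each column).
uq <- apply(X, 2, quantile, probs = 0.75)

# Rescale each sample so that its upper quartile equals the average
# upper quartile across samples.
X_uq <- sweep(X, 2, mean(uq) / uq, `*`)

# The upper quartiles now agree across samples.
apply(X_uq, 2, quantile, probs = 0.75)
```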

ploidy

An integer value for each sample or each column in the read count matrix. At least two samples must have a ploidy of 2. Default = "missing".

Value

A data matrix of normalized read counts with the same dimensions as the input matrix X.

Author(s)

Guenter Klambauer klambauer@bioinf.jku.at

Examples
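A minimal usage sketch follows. It assumes that the cn.mops package is installed and that its bundled example data set, loaded with `data(cn.mops)`, provides a read-count matrix named `X`; both points are assumptions and should be checked against the installed package.

```r
# Run only if the cn.mops package is available.
if (requireNamespace("cn.mops", quietly = TRUE)) {
  library(cn.mops)

  # Assumption: the bundled example data provides the count matrix X.
  data(cn.mops)

  # Default settings from the Usage section:
  # normType = "poisson", sizeFactor = "mean".
  X.norm <- normalizeGenome(X)

  # Upper-quartile size factors instead of the mean.
  X.uq <- normalizeGenome(X, sizeFactor = "quant", quSizeFactor = 0.75)

  # The result should have the same dimensions as the input.
  dim(X.norm)
}
```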

