Median Normalization

Description

'MedianNorm' implements the median-by-ratio normalization from Anders and Huber, 2010.

Usage

MedianNorm(Data, alternative = FALSE)

Arguments

Data

The data matrix with transcripts in rows and lanes in columns.

alternative

If alternative = TRUE, the alternative version of median normalization is applied. The alternative method is similar to median-by-ratio normalization, but can handle the case where every gene/isoform has at least one zero count (in which case median-by-ratio normalization fails).

In more detail, median-by-ratio normalization estimates the library size of each sample as follows (taking l_1, the library size of sample 1, as an example, with S samples in total):

hat(l_1) = median_g [ X_g1 / (X_g1 * X_g2 * ... * X_gS)^(1/S) ] (1)

which estimates l_1 / (l_1 * l_2 * ... * l_S)^(1/S). Since we have the constraint that (l_1 * l_2 * ... * l_S) = 1, equation (1) estimates l_1. Note that (1) can also be written as:

hat(l_1) = median_g [ (X_g1/X_g1 * X_g1/X_g2 * ... * X_g1/X_gS)^(1/S) ]

In the alternative method, we estimate l_1/l_1, l_1/l_2, ..., l_1/l_S individually by taking median_g(X_g1/X_g1), median_g(X_g1/X_g2), ..., median_g(X_g1/X_gS). We then estimate l_1 = l_1 / (l_1 * l_2 * ... * l_S)^(1/S) by taking the geometric mean of these estimates:

hat(l_1) = [ median_g(X_g1/X_g1) * median_g(X_g1/X_g2) * median_g(X_g1/X_g3) * ... * median_g(X_g1/X_gS) ]^(1/S)
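The two estimators described above can be sketched in base R as follows. This is a minimal illustration written from the formulas, not the EBSeq source code. The function names median_by_ratio and median_by_ratio_alt, the toy matrices, and the choice to drop ratios that are zero, infinite or undefined before taking each median are all assumptions made for this example.

```r
# Illustrative sketch only; this is not the EBSeq source code.
# `counts` is a numeric matrix with transcripts in rows and samples in columns.

# Median-by-ratio: median over rows of X_gs / (geometric mean of row g).
median_by_ratio <- function(counts) {
  log_counts <- log(counts)                  # log(0) is -Inf
  log_geo_mean <- rowMeans(log_counts)       # log of each row's geometric mean
  usable <- is.finite(log_geo_mean)          # rows without any zero count
  if (!any(usable)) stop("every row contains at least one zero count")
  log_ratio <- log_counts[usable, , drop = FALSE] - log_geo_mean[usable]
  apply(exp(log_ratio), 2, median)
}

# Alternative version: pairwise medians first, geometric mean afterwards.
# Assumption: ratios that are 0, Inf or NaN (caused by zero counts) are dropped.
median_by_ratio_alt <- function(counts) {
  n_samples <- ncol(counts)
  sapply(seq_len(n_samples), function(s) {
    pair_medians <- sapply(seq_len(n_samples), function(other) {
      ratio <- counts[, s] / counts[, other]
      median(ratio[is.finite(ratio) & ratio > 0])
    })
    exp(mean(log(pair_medians)))             # geometric mean of the S medians
  })
}

# Toy data: expression levels scaled by known library sizes whose product is 1.
mu <- c(10, 20, 40, 80, 160, 320)
true_sizes <- c(0.5, 1, 2)
toy <- outer(mu, true_sizes)
median_by_ratio(toy)
median_by_ratio_alt(toy)

# Put one zero in every row: no row is usable for the standard estimator,
# while the pairwise version still has non-zero pairs to work with.
toy_zero <- toy
toy_zero[cbind(1:6, rep(1:3, each = 2))] <- 0
median_by_ratio_alt(toy_zero)
```

On this noise-free toy data both functions should recover true_sizes up to floating-point error, and median_by_ratio(toy_zero) stops with an error because every row contains a zero.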

Value

The function returns a vector containing the normalization factor for each lane.
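One common way to use such factors is to divide each lane's counts by its factor. The sketch below does this with base R's sweep. The matrix and the factors are made-up toy values chosen for illustration, not output from MedianNorm.

```r
# Toy values for illustration only; not produced by MedianNorm.
counts <- matrix(c( 5, 10, 20,
                   10, 20, 40,
                   20, 40, 80),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("g1", "g2", "g3"),
                                 c("lane1", "lane2", "lane3")))
sizes <- c(lane1 = 0.5, lane2 = 1, lane3 = 2)

# Divide every column (MARGIN = 2) by the factor of its lane.
normalized <- sweep(counts, 2, sizes, "/")
normalized
```

After scaling, the three lanes of this toy matrix carry identical values in each row, which is what removing a pure library-size difference should look like.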

Author(s)

Ning Leng

References

Simon Anders and Wolfgang Huber. Differential expression analysis for sequence count data. Genome Biology (2010) 11:R106 (open access)

See Also

QuantileNorm

Examples

data(GeneMat)
Sizes = MedianNorm(GeneMat)
#EBOut = EBTest(Data = GeneMat,
#	Conditions = as.factor(rep(c("C1","C2"), each=5)),
#	sizeFactors = Sizes, maxround = 5)
