'MedianNorm' implements the median-by-ratio normalization method of Anders and Huber (2010).

```
MedianNorm(Data, alternative = FALSE)
```

`Data`
The data matrix with transcripts in rows and lanes in columns.

`alternative`
If `alternative = TRUE`, the alternative version of median normalization is applied. The alternative method is similar to median-by-ratio normalization, but it can handle the case where every gene/isoform has at least one zero count (in which case median-by-ratio normalization fails).

In more detail, take sample 1 as an example, let $l_1$ denote its library size, and assume $S$ samples in total. Median-by-ratio normalization computes

$$\hat{l}_1 = \operatorname{median}_g \left[ \frac{X_{g1}}{(X_{g1} X_{g2} \cdots X_{gS})^{1/S}} \right] \quad (1)$$

which estimates $l_1 / (l_1 l_2 \cdots l_S)^{1/S}$. Under the constraint $l_1 l_2 \cdots l_S = 1$, equation (1) estimates $l_1$. Equation (1) can also be written as

$$\hat{l}_1 = \operatorname{median}_g \left[ \left( \frac{X_{g1}}{X_{g1}} \cdot \frac{X_{g1}}{X_{g2}} \cdots \frac{X_{g1}}{X_{gS}} \right)^{1/S} \right]$$

In the alternative method, the ratios $l_1/l_1, l_1/l_2, \ldots, l_1/l_S$ are estimated individually by $\operatorname{median}_g(X_{g1}/X_{g1}), \operatorname{median}_g(X_{g1}/X_{g2}), \ldots, \operatorname{median}_g(X_{g1}/X_{gS})$. Then $l_1 = l_1 / (l_1 l_2 \cdots l_S)^{1/S}$ is estimated by the geometric mean of these estimates:

$$\hat{l}_1 = \left[ \operatorname{median}_g\!\left(\frac{X_{g1}}{X_{g1}}\right) \cdot \operatorname{median}_g\!\left(\frac{X_{g1}}{X_{g2}}\right) \cdot \operatorname{median}_g\!\left(\frac{X_{g1}}{X_{g3}}\right) \cdots \operatorname{median}_g\!\left(\frac{X_{g1}}{X_{gS}}\right) \right]^{1/S}$$
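The two estimators described above can be sketched in base R as follows. This is an illustrative sketch only, not the EBSeq source: the function names `median_by_ratio` and `median_by_ratio_alt` and the toy matrices are hypothetical, and the handling of zeros (dropping rows that contain a zero, and ignoring zero or non-finite pairwise ratios) is an assumption made for the example.

```r
# Hypothetical helper, not part of EBSeq: standard median-by-ratio estimate.
median_by_ratio <- function(counts) {
  # Assumption: use only rows with no zero counts, so the geometric mean is positive.
  keep <- rowSums(counts == 0) == 0
  if (!any(keep)) return(rep(NA_real_, ncol(counts)))  # method fails here
  x <- counts[keep, , drop = FALSE]
  # Per-row geometric mean across samples: (X_g1 * ... * X_gS)^(1/S).
  geo_mean <- exp(rowMeans(log(x)))
  # For each sample, the median over rows of X_gs / geometric mean.
  apply(x / geo_mean, 2, median)
}

# Hypothetical helper, not part of EBSeq: alternative estimate from pairwise medians.
median_by_ratio_alt <- function(counts) {
  S <- ncol(counts)
  sapply(seq_len(S), function(s) {
    # Median over rows of X_gs / X_gt for each sample t.
    pair_medians <- sapply(seq_len(S), function(t) {
      r <- counts[, s] / counts[, t]
      # Assumption: ignore ratios that are zero, infinite, or undefined (0/0).
      median(r[is.finite(r) & r > 0])
    })
    # Geometric mean of the S pairwise medians.
    exp(mean(log(pair_medians)))
  })
}

# Toy data: sample 2 is sequenced twice as deeply as sample 1, sample 3 four times.
counts <- cbind(s1 = c(10, 20, 30, 40),
                s2 = c(20, 40, 60, 80),
                s3 = c(40, 80, 120, 160))
print(median_by_ratio(counts))
print(median_by_ratio_alt(counts))

# Toy data in which every row contains at least one zero.
counts_zero <- cbind(s1 = c(0, 20, 30, 40),
                     s2 = c(20, 0, 60, 80),
                     s3 = c(40, 80, 0, 0))
print(median_by_ratio(counts_zero))      # no usable rows for the standard method
print(median_by_ratio_alt(counts_zero))  # pairwise medians are still defined
```

On the first matrix the two sketches agree, because every pairwise ratio is constant across rows. The second matrix illustrates the situation the `alternative` argument is meant for: no row is free of zeros, yet each pair of samples still shares at least one row with two positive counts.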

The function returns a vector containing the normalization factor for each lane.
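As a rough illustration of how such a vector is typically used, the sketch below divides each lane (column) by its factor using base R. The matrix `counts` and the vector `sizes` are made-up values for the example, not output from 'MedianNorm'.

```r
# Hypothetical counts (transcripts in rows, lanes in columns).
counts <- cbind(s1 = c(10, 20, 30),
                s2 = c(20, 40, 60),
                s3 = c(40, 80, 120))

# Hypothetical normalization factors, one per lane.
sizes <- c(0.5, 1, 2)

# Divide every column by the factor of its lane.
normalized <- sweep(counts, 2, sizes, "/")
print(normalized)
```

After this scaling, the three lanes in the toy matrix are directly comparable because the differences in sequencing depth have been divided out.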

Ning Leng

Simon Anders and Wolfgang Huber. Differential expression analysis for sequence count data. Genome Biology (2010) 11:R106 (open access)

QuantileNorm

```
library(EBSeq)
data(GeneMat)
# Normalization factor for each lane
Sizes = MedianNorm(GeneMat)
#EBOut = EBTest(Data = GeneMat,
# Conditions = as.factor(rep(c("C1","C2"), each=5)),
# sizeFactors = Sizes, maxround = 5)
```
