withinLaneNormalization-methods: Methods for Function 'withinLaneNormalization' in Package...
In EDASeq: Exploratory Data Analysis and Normalization for RNA-Seq


Within-lane normalization for GC-content (or other lane-specific) bias.

withinLaneNormalization(x, y, which=c("loess","median","upper","full"), offset=FALSE, num.bins=10, round=TRUE)

`x`	A numeric matrix representing the counts or a `SeqExpressionSet` object.
`y`	A numeric vector representing the covariate to normalize for (if `x` is a matrix) or a character vector with the name of the covariate (if `x` is a `SeqExpressionSet` object). Usually it is the GC-content.
`which`	Method used to normalize. See the details section and the reference below.
`offset`	Should the normalized value be returned as an offset leaving the original counts unchanged?
`num.bins`	The number of bins used to stratify the covariate for the `median`, `upper` and `full` methods. Ignored if `which` is `loess`. See the reference for a discussion on the number of bins.
`round`	If TRUE the normalization returns rounded values (pseudo-counts). Ignored if offset=TRUE.

This method implements four normalizations described in Risso et al. (2011).

The loess normalization transforms the data by regressing the counts on y and subtracting the loess fit from the counts to remove the dependence.
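The idea behind the loess step can be illustrated in base R, without EDASeq. What follows is only a sketch under stated assumptions: the helper name loess_gc_sketch, the simulated data, the log transform with a 0.1 pseudo-count, and the re-centering on the median are illustrative choices and not necessarily what the package does internally.

```r
set.seed(1)

# Simulated data (hypothetical): 500 genes whose expected count rises with GC-content.
gc <- runif(500, min = 0.2, max = 0.7)
counts <- rpois(500, lambda = exp(3 + 4 * (gc - 0.45)))

# Hypothetical helper, not part of EDASeq.
loess_gc_sketch <- function(counts, gc, pseudo = 0.1) {
  # Work on the log scale; the pseudo-count avoids log(0).
  d <- data.frame(lc = log(counts + pseudo), gc = gc)
  # Regress log-counts on the covariate with a loess smoother.
  fit <- loess(lc ~ gc, data = d)
  trend <- predict(fit, newdata = d)
  # Subtract the fitted trend, then add back the median so the overall level is kept.
  exp(d$lc - trend + median(d$lc))
}

norm_loess <- loess_gc_sketch(counts, gc)

# Correlation with GC-content before and after the adjustment.
cor(gc, log(counts + 0.1))
cor(gc, log(norm_loess))
```

The second correlation should be much closer to zero than the first, which is the sense in which the dependence on y is removed.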

The median, upper and full normalizations stratify the genes by y. Once the genes are divided into num.bins strata, the methods work as follows.

median: scales the data to have the same median in each bin.
upper: scales the data to have the same upper quartile in each bin.
full: forces the distribution of each stratum to be the same using a nonlinear full-quantile normalization, in the spirit of the one used for microarrays.
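The stratify-then-scale logic of the median and upper variants can be sketched in base R. This is an illustration under assumptions, not the package's code: the helpers gc_bins and bin_scale_sketch are hypothetical, the bins are taken to be equal-frequency (quantile-based), each bin is scaled to the median of the per-bin statistics, and the data are simulated so that no bin has a zero median or upper quartile.

```r
set.seed(2)

# Simulated data (hypothetical): 1000 genes with a GC-dependent mean.
gc <- runif(1000, min = 0.2, max = 0.7)
counts <- rpois(1000, lambda = exp(3 + 4 * (gc - 0.45)))

# Hypothetical helper: assign each gene to one of num.bins equal-frequency strata.
gc_bins <- function(gc, num.bins = 10) {
  breaks <- quantile(gc, probs = seq(0, 1, length.out = num.bins + 1))
  cut(gc, breaks = breaks, include.lowest = TRUE)
}

# Hypothetical helper: rescale each stratum so that a chosen quantile
# (prob = 0.5 for the median, prob = 0.75 for the upper quartile) agrees across strata.
bin_scale_sketch <- function(counts, gc, num.bins = 10, prob = 0.5) {
  bins <- gc_bins(gc, num.bins)
  stat_by_bin <- as.vector(tapply(counts, bins, quantile, probs = prob))
  # Common target level (an assumption of this sketch).
  target <- median(stat_by_bin)
  counts * target / stat_by_bin[as.integer(bins)]
}

norm_median <- bin_scale_sketch(counts, gc, prob = 0.5)
norm_upper  <- bin_scale_sketch(counts, gc, prob = 0.75)

# Per-bin medians after the median scaling: all bins share the same value.
tapply(norm_median, gc_bins(gc), median)
```

A single prob argument covers both variants because they differ only in which quantile is equalized; the full method instead matches the whole distribution of each stratum and is not shown here.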

signature(x = "matrix", y = "numeric"): It returns a matrix with the normalized counts if offset=FALSE or with the offset if offset=TRUE.
signature(x = "SeqExpressionSet", y = "character"): It returns a SeqExpressionSet with the normalized counts in the normalizedCounts slot and with the offset in the offset slot (if offset=TRUE).

Davide Risso.

D. Risso, K. Schwartz, G. Sherlock and S. Dudoit (2011). GC-Content Normalization for RNA-Seq Data. Manuscript in Preparation.

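# EDASeq provides newSeqExpressionSet() and withinLaneNormalization()
library(EDASeq)
# Example data: yeast gene-level counts and GC-content
library(yeastRNASeq)
data(geneLevelData)
data(yeastGC)

# Keep only the genes that have both counts and a GC-content value
sub <- intersect(rownames(geneLevelData), names(yeastGC))

mat <- as.matrix(geneLevelData[sub, ])

# Store counts, sample conditions and the per-gene GC-content in a SeqExpressionSet
data <- newSeqExpressionSet(mat,
                            phenoData=AnnotatedDataFrame(
                                      data.frame(conditions=factor(c("mut", "mut", "wt", "wt")),
                                                 row.names=colnames(geneLevelData))),
                            featureData=AnnotatedDataFrame(data.frame(gc=yeastGC[sub])))

# Full-quantile within-lane normalization for the "gc" covariate
norm <- withinLaneNormalization(data, "gc", which="full", offset=FALSE)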