equalizeLibSizes: Equalize Library Sizes by Quantile-to-Quantile Normalization


View source: R/equalizeLibSizes.R

Description

Adjusts counts so that the effective library sizes are equal, preserving fold-changes between groups and preserving biological variability within each group.

Usage

## S3 method for class 'DGEList'
equalizeLibSizes(y, dispersion=NULL, ...)
## Default S3 method:
equalizeLibSizes(y, group=NULL, dispersion=NULL, 
            lib.size=NULL, ...)

Arguments

y

matrix of counts or a DGEList object.

dispersion

numeric scalar or vector of dispersion parameters. By default, it is extracted from y or, if y contains no dispersion information, set to 0.05.

group

vector or factor giving the experimental group/condition for each library.

lib.size

numeric vector giving the total count (sequence depth) for each library.

...

other arguments that are not currently used.

Details

This function implements the quantile-quantile normalization method of Robinson and Smyth (2008). It computes normalized counts, or pseudo-counts, used by exactTest and estimateCommonDisp.

The output pseudo-counts are the counts that would theoretically have arisen had the effective library sizes been equal for all samples. The pseudo-counts are computed in such a way as to preserve fold-change differences between the groups defined by y$samples$group as well as biological variability within each group. Consequently, the results will depend on how the groups are defined.
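As a rough illustration of the quantile-to-quantile idea, the base R sketch below maps a count observed under one negative binomial mean to the value at the same standardized quantile under another mean, using a normal approximation. The helper q2q_normal_sketch is hypothetical and is not edgeR's implementation; q2qnbinom is the package's own quantile-to-quantile function and may differ in its approximation.

```r
# Hypothetical helper, for illustration only; not part of edgeR.
# Assumes negative binomial variance: mean + dispersion * mean^2.
q2q_normal_sketch <- function(x, input.mean, output.mean, dispersion = 0) {
  input.sd  <- sqrt(input.mean  + dispersion * input.mean^2)
  output.sd <- sqrt(output.mean + dispersion * output.mean^2)
  # Standardize the count under the input distribution ...
  z <- (x - input.mean) / input.sd
  # ... and place it at the same standardized position under the output distribution.
  output.mean + z * output.sd
}

# A count equal to the input mean maps to the output mean;
# counts above or below the mean keep their relative position.
mapped <- q2q_normal_sketch(c(5, 10, 15), input.mean = 10, output.mean = 20)
print(mapped)
```

Because the mapping is continuous, the results are generally not integers, which is consistent with the pseudo-counts being non-integer values. This simplified version can return negative values for very small counts.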

Note that the column sums of the pseudo.counts matrix will not generally be equal, because the effective library sizes are not necessarily the same as the actual library sizes and because the normalized pseudo-counts are not equal to expected counts.

Value

equalizeLibSizes.DGEList returns a DGEList object with the following new components:

pseudo.counts

numeric matrix of normalized pseudo-counts

pseudo.lib.size

normalized library size

equalizeLibSizes.default returns a list with components pseudo.counts and pseudo.lib.size.
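A minimal sketch of calling the default method on a plain count matrix is given below, assuming edgeR is installed. The simulated data, the seed, and the group labels "A" and "B" are arbitrary choices made for illustration; only the argument names and the returned component names are taken from this page.

```r
library(edgeR)

set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

# Four libraries in two groups; columns alternate between shallower and deeper sequencing.
lambda <- rep(c(10, 20, 10, 20), each = 1000)
counts <- matrix(rpois(4000, lambda = lambda), nrow = 1000)
colnames(counts) <- paste0("Sample", 1:4)
group <- factor(c("A", "A", "B", "B"))

# Default method: the count matrix is passed directly and the groups are given explicitly.
out <- equalizeLibSizes(counts, group = group, dispersion = 0.05)

names(out)                   # components of the returned list
dim(out$pseudo.counts)       # same shape as the input matrix
out$pseudo.lib.size          # normalized library size
colSums(out$pseudo.counts)   # not necessarily equal across libraries
```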

Note

This function is intended mainly for internal edgeR use. It is not normally called directly by users.

Author(s)

Mark Robinson, Davis McCarthy, Gordon Smyth

References

Robinson MD and Smyth GK (2008). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics, 9, 321-332. http://biostatistics.oxfordjournals.org/content/9/2/321

See Also

q2qnbinom

Examples

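library(edgeR)

# Simulate Poisson counts for two libraries of different sequencing depth
ngenes <- 1000
nlibs <- 2
counts <- matrix(0, ngenes, nlibs)
colnames(counts) <- c("Sample1", "Sample2")
counts[,1] <- rpois(ngenes, lambda=10)
counts[,2] <- rpois(ngenes, lambda=20)
summary(counts)

# Equalize the library sizes and inspect the resulting pseudo-counts
y <- DGEList(counts=counts)
out <- equalizeLibSizes(y)
summary(out$pseudo.counts)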

Example output

Loading required package: limma
    Sample1          Sample2     
 Min.   : 2.000   Min.   : 8.00  
 1st Qu.: 8.000   1st Qu.:17.00  
 Median :10.000   Median :20.00  
 Mean   : 9.775   Mean   :20.03  
 3rd Qu.:12.000   3rd Qu.:23.00  
 Max.   :19.000   Max.   :35.00  
    Sample1          Sample2      
 Min.   : 3.614   Min.   : 5.194  
 1st Qu.:11.417   1st Qu.:11.867  
 Median :14.134   Median :13.876  
 Mean   :14.018   Mean   :13.974  
 3rd Qu.:16.963   3rd Qu.:16.053  
 Max.   :26.748   Max.   :24.674  

edgeR documentation built on Dec. 17, 2018, 6 p.m.