aveLogCPM: Average Log Counts Per Million

View source: R/aveLogCPM.R

Description

Compute average log2 counts-per-million for each row of counts.

Usage

## S3 method for class 'DGEList'
aveLogCPM(y, normalized.lib.sizes=TRUE, prior.count=2, dispersion=NULL, ...)
## Default S3 method:
aveLogCPM(y, lib.size=NULL, offset=NULL, prior.count=2, dispersion=NULL,
          weights=NULL, ...)

Arguments

y

numeric matrix containing counts, or a DGEList object for the DGEList method. Rows correspond to genes and columns to libraries.

normalized.lib.sizes

logical, use normalized library sizes?

prior.count

numeric scalar or vector of length nrow(y), containing the average value(s) to be added to each count to avoid infinite values on the log-scale.

dispersion

numeric scalar or vector of negative-binomial dispersions. Defaults to 0.05.

lib.size

numeric vector of library sizes. Defaults to colSums(y). Ignored if offset is not NULL.

offset

numeric matrix of offsets for the log-linear models.

weights

optional numeric matrix of observation weights.

...

other arguments, not currently used.

Details

This function uses mglmOneGroup to compute average counts-per-million (AveCPM) for each row of counts, and returns log2(AveCPM). An average value of prior.count is added to the counts before running mglmOneGroup. If prior.count is a vector, each entry will be added to all counts in the corresponding row of y, as described in addPriorCount.
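
The phrase "an average value of prior.count" can be made concrete with a small base-R sketch. It assumes the scheme used by addPriorCount is to scale the prior by each library's size relative to the mean library size, and to enlarge each library size by twice the scaled prior. Both steps are assumptions of this sketch, not a copy of the edgeR source, and the variable names are illustrative only. Under those assumptions, the individual logCPM values agree with the cpm(y, log=TRUE, prior.count=2) matrix shown in the Example output.

```r
# Base-R sketch only; it does not call edgeR.
y <- matrix(c(0, 100, 30, 40), nrow = 2, ncol = 2)
lib.size <- colSums(y)
prior.count <- 2

# Assumed scaling: each library receives a prior proportional to its size,
# so that the prior averages to prior.count across libraries.
scaled.prior <- prior.count * lib.size / mean(lib.size)

# Assumed adjustment: each library size grows by twice its scaled prior.
adj.lib.size <- lib.size + 2 * scaled.prior

# Add the per-library prior to every count in that library's column.
y.aug <- sweep(y, 2, scaled.prior, "+")

# Individual log2 counts-per-million computed from the augmented counts.
logcpm.manual <- log2(sweep(y.aug, 2, adj.lib.size, "/") * 1e6)
print(round(logcpm.manual, 5))
```

Because the prior is strictly positive in every library, no augmented count is zero, so every log2 value is finite even for the zero count in the first row.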

This function is similar to

log2(rowMeans(cpm(y, ...))),

but with the refinement that larger library sizes are given more weight in the average. The two versions will agree for large values of the dispersion.
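
The effect of weighting by library size can be seen by comparing the two extremes by hand. The following base-R sketch does not call edgeR or mglmOneGroup. It computes the unweighted row mean of the individual CPM values, and the pooled estimate in which counts are summed across libraries before dividing by the total library size, which is what the Examples section describes for dispersion=0 and prior.count=0. No prior count is added here, and the variable names are illustrative only.

```r
# Base-R sketch only; it does not call edgeR.
y <- matrix(c(0, 100, 30, 40), nrow = 2, ncol = 2)
lib.size <- colSums(y)

# Individual counts-per-million: divide each column by its library size.
cpm.manual <- sweep(y, 2, lib.size, "/") * 1e6

# Unweighted version: every library contributes equally to the average.
unweighted <- log2(rowMeans(cpm.manual))

# Pooled version: libraries contribute in proportion to their size.
pooled <- log2(rowSums(y) / sum(lib.size) * 1e6)

print(unweighted)
print(pooled)
```

The two vectors differ whenever the library sizes differ, and intermediate dispersion values give results between these two extremes.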

Value

Numeric vector giving log2(AveCPM) for each row of y.

Author(s)

Gordon Smyth

See Also

See cpm for individual logCPM values, rather than genewise averages.

Addition of the prior count is performed using the strategy described in addPriorCount.

The computations for aveLogCPM are done by mglmOneGroup.

Examples

y <- matrix(c(0,100,30,40),2,2)
lib.size <- c(1000,10000)

# With a large dispersion, the result approaches log2 of the row-wise mean of the
# individual CPMs (cpm() below prints the individual logCPM values for comparison):
aveLogCPM(y, dispersion=1e4)
cpm(y, log=TRUE, prior.count=2)

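# With disp=0 and no prior count, the function is equivalent to pooling the counts
# before dividing by the total library size. Note that aveLogCPM() is called without
# lib.size here, so it uses colSums(y), whereas the manual calculation uses the
# lib.size vector defined above; the two results therefore differ by the constant
# log2(sum(lib.size)/sum(colSums(y))):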
aveLogCPM(y,prior.count=0,dispersion=0)
cpms <- rowSums(y)/sum(lib.size)*1e6
log2(cpms)

# prior.count and dispersion may also be vectors with one entry per row of y
# (runif() makes these results vary between runs):
aveLogCPM(y, prior.count=runif(nrow(y), 1, 5))
aveLogCPM(y, dispersion=runif(nrow(y), 0, 0.2))

Example output

Loading required package: limma
[1] 17.79314 19.55987
         [,1]     [,2]
[1,] 14.45584 18.71994
[2,] 19.89878 19.11609
[1] 17.42907 19.65146
[1] 11.41324 13.63564
[1] 17.66298 19.54732
[1] 17.69110 19.56458

edgeR documentation built on Nov. 17, 2017, 9:48 a.m.