cpm: Counts per Million or Reads per Kilobase per Million

Description

Compute counts per million (CPM) or reads per kilobase per million (RPKM).

Usage

## S3 method for class 'DGEList'
cpm(y, normalized.lib.sizes = TRUE,
       log = FALSE, prior.count = 0.25, ...)
## Default S3 method:
cpm(y, lib.size = NULL,
       log = FALSE, prior.count = 0.25, ...)
## S3 method for class 'DGEList'
rpkm(y, gene.length = NULL, normalized.lib.sizes = TRUE,
       log = FALSE, prior.count = 0.25, ...)
## Default S3 method:
rpkm(y, gene.length, lib.size = NULL,
       log = FALSE, prior.count = 0.25, ...)
## S3 method for class 'DGEList'
cpmByGroup(y, group = NULL, dispersion = NULL, ...)
## Default S3 method:
cpmByGroup(y, group = NULL,
       dispersion = 0.05, offset = NULL, weights = NULL, ...)
## S3 method for class 'DGEList'
rpkmByGroup(y, group = NULL, gene.length = NULL, dispersion = NULL, ...)
## Default S3 method:
rpkmByGroup(y, group = NULL, gene.length,
       dispersion = 0.05, offset = NULL, weights = NULL, ...)

Arguments

y

matrix of counts or a DGEList object

normalized.lib.sizes

logical, use normalized library sizes?

lib.size

library size, defaults to colSums(y).

log

logical, if TRUE then log2 values are returned.

prior.count

average count to be added to each observation to avoid taking log of zero. Used only if log=TRUE.

gene.length

vector of length nrow(y) giving gene length in bases, or the name of the column of y$genes containing the gene lengths.

group

factor giving group membership for columns of y. Defaults to y$samples$group for the DGEList method and to a single-level factor for the default method.

dispersion

numeric vector of negative binomial dispersions.

offset

numeric matrix of same size as y giving offsets for the log-linear models. Can be a scalar or a vector of length ncol(y), in which case it is expanded out to a matrix.

weights

numeric vector or matrix of non-negative quantitative weights. Can be a vector of length equal to the number of libraries, or a matrix of the same size as y.

...

other arguments are not used.

Details

CPM or RPKM values are useful descriptive measures for the expression level of a gene. By default, the normalized library sizes are used in the computation for DGEList objects but simple column sums for matrices.
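
As a rough illustration of the quantities described above, the following base-R sketch computes unlogged CPM and RPKM by hand for a small matrix, using column sums as library sizes. It does not call edgeR, and the names counts, gene_length, lib_size, cpm_manual and rpkm_manual are made up for this example.

```r
# Base-R sketch of unlogged CPM and RPKM; it does not call edgeR.
# All object names here are illustrative, not part of the package.
counts <- matrix(c(10, 0, 30, 60,
                   20, 5, 15, 60),
                 nrow = 4, ncol = 2,
                 dimnames = list(paste0("gene", 1:4), c("A", "B")))
gene_length <- c(1000, 2000, 500, 1500)  # gene lengths in bases

lib_size <- colSums(counts)              # default library size for a matrix

# CPM: divide each column by its library size, then scale to one million
cpm_manual <- t(t(counts) / lib_size) * 1e6

# RPKM: CPM divided by gene length in kilobases (vector recycles over rows)
rpkm_manual <- cpm_manual / (gene_length / 1000)

cpm_manual
rpkm_manual
```

With column sums as library sizes, each column of cpm_manual adds up to one million. For a DGEList, the normalized library sizes would take the place of lib_size.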

If log-values are computed, then a small count, given by prior.count but scaled to be proportional to the library size, is added to y to avoid taking the log of zero.
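
One way to read the sentence above is sketched below in base R. The exact scaling is an assumption made for illustration: the prior count is multiplied by lib.size / mean(lib.size), and each library size is increased by twice the scaled prior count before dividing. The text only states that the prior count is scaled in proportion to library size, so treat the details as a sketch rather than the package source.

```r
# Sketch of log2-CPM with a library-size-scaled prior count (base R only).
# Assumption: the scaling used here is one plausible reading of the text.
counts <- matrix(c(0, 4, 6, 10, 0, 30), nrow = 3, ncol = 2)
lib_size <- colSums(counts)     # 10 and 40
prior_count <- 0.25

# Prior count scaled in proportion to each library's size
prior_scaled <- prior_count * lib_size / mean(lib_size)

# Library sizes augmented by twice the scaled prior count (assumption)
lib_adj <- lib_size + 2 * prior_scaled

# Work on the transpose so that each library (row of t(counts)) is paired
# with its own scaled prior count and adjusted library size
log_cpm <- log2(t((t(counts) + prior_scaled) / lib_adj) * 1e6)
log_cpm
```

Under this assumption every value is finite, and a zero count yields the same log2-CPM in every library, because the ratio of scaled prior count to adjusted library size is constant across libraries.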

The rpkm method for DGEList objects will try to find the gene lengths in a column of y$genes called Length or length. Failing that, it will look for any column name containing "length" in any capitalization.
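
The lookup order described above can be sketched with a small helper. find_length_column is a hypothetical function written for this illustration and is not part of edgeR; returning the first match when several column names contain "length" is also an assumption.

```r
# Hypothetical helper mirroring the lookup order described above.
# It is an illustration, not the function edgeR uses internally.
find_length_column <- function(genes) {
  nms <- colnames(genes)
  # 1. Exact match on "Length", then "length"
  for (candidate in c("Length", "length")) {
    if (candidate %in% nms) return(candidate)
  }
  # 2. Any column whose name contains "length", ignoring case
  #    (taking the first hit is an assumption of this sketch)
  hits <- grep("length", nms, ignore.case = TRUE, value = TRUE)
  if (length(hits) > 0) return(hits[1])
  NULL  # nothing found; gene.length would have to be supplied explicitly
}

find_length_column(data.frame(Symbol = "a", Length = 1000))
find_length_column(data.frame(Symbol = "a", GeneLength = 1000))
find_length_column(data.frame(Symbol = "a"))
```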

cpmByGroup and rpkmByGroup compute group average values on the unlogged scale.

Value

A numeric matrix of CPM or RPKM values. cpm and rpkm produce matrices of the same size as y. cpmByGroup and rpkmByGroup produce matrices with a column for each level of group. If log = TRUE, then the values are on the log2 scale.

Note

aveLogCPM(y), rowMeans(cpm(y,log=TRUE)) and log2(rowMeans(cpm(y))) all give slightly different results.
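
Part of the difference comes simply from the order of averaging and taking logs. The base-R sketch below uses a made-up matrix of CPM values (cpm_vals) and ignores the prior count entirely. It does not reproduce aveLogCPM, which is computed differently again.

```r
# Mean of logs versus log of mean on made-up CPM values (base R only).
# The prior count used by cpm(y, log=TRUE) is deliberately left out.
cpm_vals <- matrix(c(100, 400,
                      50,  50),
                   nrow = 2, byrow = TRUE)

mean_of_logs <- rowMeans(log2(cpm_vals))  # analogous to rowMeans(cpm(y, log=TRUE))
log_of_mean  <- log2(rowMeans(cpm_vals))  # analogous to log2(rowMeans(cpm(y)))

mean_of_logs
log_of_mean
```

The two summaries agree only for the second row, where the values are identical across samples. Whenever the values vary, the mean of the logs is smaller than the log of the mean.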

Author(s)

Davis McCarthy, Gordon Smyth

See Also

aveLogCPM

Examples

library(edgeR)
y <- matrix(rnbinom(20,size=1,mu=10),5,4)
cpm(y)

d <- DGEList(counts=y, lib.size=1001:1004)
cpm(d)
cpm(d,log=TRUE)

d$genes <- data.frame(Length=c(1000,2000,500,1500,3000))
rpkm(d)

Example output

Loading required package: limma
          [,1]      [,2]      [,3]     [,4]
[1,]  71428.57      0.00 134328.36 326315.8
[2,]      0.00  58823.53 477611.94 105263.2
[3,] 678571.43 411764.71 223880.60 294736.8
[4,]  71428.57 235294.12 134328.36 136842.1
[5,] 178571.43 294117.65  29850.75 136842.1
    Sample1  Sample2   Sample3   Sample4
1  1998.002    0.000  8973.081 30876.494
2     0.000  998.004 31904.287  9960.159
3 18981.019 6986.028 14955.135 27888.446
4  1998.002 3992.016  8973.081 12948.207
5  4995.005 4990.020  1994.018 12948.207
    Sample1   Sample2  Sample3  Sample4
1 11.133308  7.961463 13.17022 14.92511
2  7.961463 10.283967 14.97198 13.31691
3 14.230381 12.820139 13.89149 14.77950
4 11.133308 12.049603 13.17022 13.68727
5 12.355838 12.354466 11.13075 13.68727
    Sample1   Sample2    Sample3   Sample4
1  1998.002     0.000  8973.0808 30876.494
2     0.000   499.002 15952.1436  4980.080
3 37962.038 13972.056 29910.2692 55776.892
4  1332.001  2661.344  5982.0538  8632.138
5  1665.002  1663.340   664.6726  4316.069

edgeR documentation built on June 25, 2018, 6 p.m.