fpkm: FPKM: fragments per kilobase per million mapped fragments



Description

The following function returns fragment counts normalized per kilobase of feature length per million mapped fragments (by default using a robust estimate of the library size, as in estimateSizeFactors).

Usage

fpkm(object, robust = TRUE)

Arguments

object

a DESeqDataSet

robust

whether to normalize using size factors (robust = TRUE, the default) rather than the column sums of the raw counts (robust = FALSE). The per-million normalization itself is carried out by the fpm function.
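The difference between the two settings can be sketched in base R. This is only an illustration of the median ratio idea, not the package's code: the variable names are made up, and the step that rescales size factors by the geometric mean of the column sums is an assumption about how size factors are put on a library-size scale.

```r
# Toy count matrix: rows are features, columns are samples.
k <- matrix(1e6 * rep(c(.125, .25, .25, .5), each = 4), ncol = 4)

# Median-of-ratios size factors: for each sample, the median ratio of its
# counts to the per-feature geometric mean across samples.
log_geo_means <- rowMeans(log(k))
size_factors <- apply(k, 2, function(col) {
  keep <- is.finite(log_geo_means) & col > 0
  exp(median((log(col) - log_geo_means)[keep]))
})

# Non-robust library sizes are simply the column sums.
lib_raw <- colSums(k)

# ASSUMPTION: size factors are rescaled by the geometric mean of the
# column sums so that they can stand in for library sizes.
lib_robust <- size_factors * exp(mean(log(lib_raw)))

# Fragments per million under each choice of library size.
fpm_raw <- 1e6 * sweep(k, 2, lib_raw, "/")
fpm_robust <- 1e6 * sweep(k, 2, lib_robust, "/")
```

On this toy matrix every feature scales identically across samples, so the two library-size estimates coincide. They would differ when a few highly expressed features dominate the column sums.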

Details

The length of the features (e.g. genes) is determined as follows, in order of precedence: if there is a matrix named "avgTxLength" in assays(object), it is used for the length normalization. Otherwise, if a column basepairs is present in mcols(object), that column is used. Otherwise, feature length is calculated from rowRanges(object) as the number of basepairs in the union of all GRanges assigned to a given row of object, e.g., the union of all basepairs of exons of a given gene.
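The union-of-basepairs length can be illustrated without any Bioconductor objects. The helper below, union_width, is a hypothetical name introduced here for illustration; it merges overlapping or adjacent inclusive intervals and sums their widths. With GenomicRanges loaded, an expression along the lines of sum(width(reduce(x))) on a GRangesList should give the same numbers, though that is an assumption about how the package computes it.

```r
# Hypothetical helper (not part of the package): number of positions
# covered by the union of inclusive integer intervals [start, end].
union_width <- function(start, end) {
  o <- order(start)
  start <- start[o]
  end <- end[o]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      # Overlapping or adjacent: extend the current merged interval.
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap: close the current interval and start a new one.
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

# Three ranges, two of which overlap (2001-3000 lies inside 1001-3000).
len_overlap <- union_width(c(1, 1001, 2001), c(500, 3000, 3000))

# Two adjacent ranges covering 1-1000.
len_adjacent <- union_width(c(1, 501), c(500, 1000))
```

Overlapping basepairs within one feature are counted once, which is why the first feature has length 2500 rather than 3500.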

Note that, when the read/fragment counting has inter-feature dependencies, a strict normalization would not incorporate the basepairs of a feature which overlap another feature. This inter-feature dependence is not taken into consideration in the internal union basepair calculation.

Value

a matrix which is normalized per kilobase of the union of basepairs in the GRangesList or GRanges in rowRanges(object), and per million mapped fragments, using either the robust median ratio method (robust=TRUE, default) or the column sums of the raw counts (robust=FALSE). Defining a column mcols(object)$basepairs takes precedence over the internal calculation of the kilobases for each row.
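As a rough check on what the returned values mean, the non-robust calculation can be written out by hand in base R. The function fpkm_by_hand and the input names are hypothetical, chosen for this sketch; it assumes feature lengths are already known in basepairs and uses column sums as library sizes.

```r
# Toy inputs: 4 features by 4 samples, with feature lengths in basepairs.
counts <- matrix(1e6 * rep(c(.125, .25, .25, .5), each = 4), ncol = 4)
basepairs <- c(1000, 1000, 2000, 2500)

# Hypothetical helper (not part of the package): non-robust FPKM.
fpkm_by_hand <- function(counts, basepairs) {
  # Fragments per million: divide each column by its total, times 1e6.
  fpm <- 1e6 * sweep(counts, 2, colSums(counts), "/")
  # Per kilobase: divide each row by its length in bp, times 1e3.
  1e3 * sweep(fpm, 1, basepairs, "/")
}

fpkm_values <- fpkm_by_hand(counts, basepairs)

# Doubling every count doubles the library sizes too, so FPKM is unchanged.
fpkm_doubled <- fpkm_by_hand(counts * 2, basepairs)
```

Because each value is a ratio to the library size, sequencing depth cancels out, while longer features receive proportionally smaller values for the same count.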

See Also

fpm

Examples

# create a matrix with 1 million counts for the
# 2nd and 3rd column, the 1st and 4th have
# half and double the counts, respectively.
m <- matrix(1e6 * rep(c(.125, .25, .25, .5), each=4),
            ncol=4, dimnames=list(1:4,1:4))
mode(m) <- "integer"
se <- SummarizedExperiment(m, colData=DataFrame(sample=1:4))
dds <- DESeqDataSet(se, ~ 1)

# create 4 GRanges with lengths: 1, 1, 2, 2.5 Kb
gr1 <- GRanges("chr1",IRanges(1,1000))
gr2 <- GRanges("chr1",IRanges(c(1,501),c(500,1000)))
gr3 <- GRanges("chr1",IRanges(c(1,1001),c(1000,2000)))
gr4 <- GRanges("chr1",IRanges(c(1,1001,2001),c(500,3000,3000)))
rowRanges(dds) <- GRangesList(gr1,gr2,gr3,gr4)

# the raw counts
counts(dds)

# the FPKM values
fpkm(dds)

# FPKM is held constant when all counts are doubled,
# because values are scaled per 1 million fragments
counts(dds) <- counts(dds) * 2L
round(fpkm(dds))

nlhuong/ZeroInflatedDESeq2 documentation built on May 23, 2019, 9:06 p.m.