The following function returns fragment counts normalized
per kilobase of feature length per million mapped fragments
(by default using a robust estimate of the library size,
whether to use size factors to normalize
rather than taking the column sums of the raw counts,
The length of the features (e.g. genes) is calculated one of two ways:
(1) If there is a matrix named "avgTxLength" in
this will take precedence in the length normalization.
This occurs when using the tximport-DESeq2 pipeline.
(2) Otherwise, feature length is calculated
rowRanges of the dds object,
if a column
basepairs is not present in
The calculated length is the number of basepairs in the union of all
assigned to a given row of
the union of all basepairs of exons of a given gene.
Note that the second approach over-estimates the gene length
(average transcript length, weighted by abundance is a more appropriate
normalization for gene counts), and so the FPKM will be an underestimate of the true value.
Note that, when the read/fragment counting has inter-feature dependencies, a strict normalization would not incorporate the basepairs of a feature which overlap another feature. This inter-feature dependence is not taken into consideration in the internal union basepair calculation.
a matrix which is normalized per kilobase of the
union of basepairs in the
of the mcols(object), and per million of mapped fragments,
either using the robust median ratio method (robust=TRUE, default)
or using raw counts (robust=FALSE).
Defining a column
precedence over internal calculation of the kilobases for each row.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
# create a matrix with 1 million counts for the # 2nd and 3rd column, the 1st and 4th have # half and double the counts, respectively. m <- matrix(1e6 * rep(c(.125, .25, .25, .5), each=4), ncol=4, dimnames=list(1:4,1:4)) mode(m) <- "integer" se <- SummarizedExperiment(list(counts=m), colData=DataFrame(sample=1:4)) dds <- DESeqDataSet(se, ~ 1) # create 4 GRanges with lengths: 1, 1, 2, 2.5 Kb gr1 <- GRanges("chr1",IRanges(1,1000)) # 1kb gr2 <- GRanges("chr1",IRanges(c(1,1001),c( 500,1500))) # 1kb gr3 <- GRanges("chr1",IRanges(c(1,1001),c(1000,2000))) # 2kb gr4 <- GRanges("chr1",IRanges(c(1,1001),c(200,1300))) # 500bp rowRanges(dds) <- GRangesList(gr1,gr2,gr3,gr4) # the raw counts counts(dds) # the FPM values fpm(dds) # the FPKM values fpkm(dds)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.