fpkm: Create normalizations of overlapping read counts.
In Roleren/ORFik: Open Reading Frames in Genomics

View source: R/riboseq_features.R

fpkm	R Documentation

Create normalizations of overlapping read counts.

Description

FPKM is short for "Fragments Per Kilobase of transcript per Million fragments in library". When calculating RiboSeq data FPKM over ORFs, use ORFs as 'grl'. When calculating RNASeq data FPKM, use full transcripts as 'grl'. It is equal to RPKM given that you do not have paired end reads.

Usage

fpkm(grl, reads, pseudoCount = 0, librarySize = "full", weight = 1L)

Arguments

`grl`	a `GRangesList` object can be either transcripts, 5' utrs, cds', 3' utrs or ORFs as a special case (uORFs, potential new cds' etc). If regions are not spliced you can send a `GRanges` object.
`reads`	a `GAlignments`, `GRanges` or `GRangesList` object, usually of RiboSeq, RnaSeq, CageSeq, etc.
`pseudoCount`	a numeric, default 0, set it to 1 if you want to avoid NA and inf values.
`librarySize`	either numeric value or character vector. Default ("full"), number of alignments in library (reads). If you just have a subset, you can give the value by librarySize = length(wholeLib) or sum(wholeLib$score), if you want lib size to be only number of reads overlapping grl, do: librarySize = "overlapping" sum(countOverlaps(reads, grl) > 0), if reads[1] has 3 hits in grl, and reads[2] has 2 hits, librarySize will be 2, not 5. You can also get the inverse overlap, if you want lib size to be total number of overlaps, do: librarySize = "DESeq" This is standard fpkm way of DESeq2::fpkm(robust = FALSE) sum(countOverlaps(grl, reads)) if grl[1] has 3 reads and grl[2] has 2 reads, librarySize is 5, not 2.
`weight`	a numeric/integer vector or metacolumn name. (default: 1L, no differential weighting). If weight is name of defined meta column in reads object, it gives the number of times a read was found at that position. GRanges("chr1", 1, "+", score = 5), would mean "score" column tells that this alignment region was found 5 times. if 1L it means each read is weighted equal as 1, this is what among others countOverlaps() presumes, if single number (!= 1), it repeats for all ranges, if vector with length > 1, it must be equal size of the reads object.

Details

Note also that you must consider if you will use the whole read library or just the reads overlapping 'grl' for library size. A normal question here is, does it make sense to include rRNA in library size ? If you only want overlapping grl, do: librarySize = "overlapping"

Value

a numeric vector with the fpkm values

References

doi: 10.1038/nbt.1621

Examples

ORF <- GRanges(seqnames = "1",
               ranges = IRanges(start = c(1, 10, 20),
               end = c(5, 15, 25)),
               strand = "+")
grl <- GRangesList(tx1_1 = ORF)
RFP <- GRanges("1", IRanges(25, 25),"+")
fpkm(grl, RFP)

# With weights (10 reads at position 25)
RFP <- GRanges("1", IRanges(25, 25),"+", score = 10)
fpkm(grl, RFP, weight = "score")

Roleren/ORFik documentation built on April 12, 2025, 5:31 a.m.