floss: Fragment Length Organization Similarity Score
In ORFik: Open Reading Frames in Genomics

Description Usage Arguments Details Value References See Also Examples

This feature is usually calcualted only for RiboSeq reads. For reads of width between 'start' and 'end', sum the fraction of RiboSeq reads (per read widths) that overlap ORFs and normalize by CDS read width fractions. So if all read length are width 34 in ORFs and CDS, value is 1. If width is 33 in ORFs and 34 in CDS, value is 0. If width is 33 in ORFs and 50/50 (33 and 34) in CDS, values will be 0.5 (for 33).

1	floss(grl, RFP, cds, start = 26, end = 34, weight = 1L)

`grl`	a `GRangesList` object can be either transcripts, 5' utrs, cds', 3' utrs or ORFs as a special case (uORFs, potential new cds' etc). If regions are not spliced you can send a `GRanges` object.
`RFP`	ribosomal footprints, given as `GAlignments` or `GRanges` object, must be already shifted and resized to the p-site. Requires a $size column with original read lengths.
`cds`	a `GRangesList` of coding sequences, cds has to have names as grl so that they can be matched
`start`	usually 26, the start of the floss interval (inclusive)
`end`	usually 34, the end of the floss interval (inclusive)
`weight`	a vector (default: 1L, if 1L it is identical to countOverlaps()), if single number (!= 1), it applies for all, if more than one must be equal size of 'reads'. else it must be the string name of a defined meta column in subject "reads", that gives number of times a read was found. GRanges("chr1", 1, "+", score = 5), would mean "score" column tells that this alignment region was found 5 times.

Pseudo explanation of the function:

1	SUM[start to stop]((grl[start:end][name]/grl) / (cds[start:end][name]/cds))

Where 'name' is transcript names.
Please read more in the article.

a vector of FLOSS of length same as grl, 0 means no RFP reads in range, 1 is perfect match.

doi: 10.1016/j.celrep.2014.07.045

Other features: computeFeaturesCage(), computeFeatures(), countOverlapsW(), disengagementScore(), distToCds(), distToTSS(), entropy(), fpkm_calc(), fpkm(), fractionLength(), initiationScore(), insideOutsideORF(), isInFrame(), isOverlapping(), kozakSequenceScore(), orfScore(), rankOrder(), ribosomeReleaseScore(), ribosomeStallingScore(), startRegionCoverage(), startRegion(), stopRegion(), subsetCoverage(), translationalEff()

ORF1 <- GRanges(seqnames = "1",
               ranges = IRanges(start = c(1, 12, 22),
               end = c(10, 20, 32)),
               strand = "+")
grl <- GRangesList(tx1_1 = ORF1)
# RFP is 1 width position based GRanges
RFP <- GRanges("1", IRanges(c(1, 25, 35, 38), width = 1), "+")
RFP$size <- c(28, 28, 28, 29) # original width in size col
cds <-  GRangesList(tx1 = GRanges("1", IRanges(35, 44), "+"))
# grl must have same names as cds + _1 etc, so that they can be matched.
floss(grl, RFP, cds)
# or change ribosome start/stop, more strict
floss(grl, RFP, cds, 28, 28)

# With repeated alignments in score column
ORF2 <- GRanges(seqnames = "1",
               ranges = IRanges(start = c(12, 22, 36),
               end = c(20, 32, 38)),
               strand = "+")
grl <- GRangesList(tx1_1 = ORF1, tx1_2 = ORF2)
score(RFP) <- c(5, 10, 5, 10)
floss(grl, RFP, cds, weight = "score")