View source: R/compute_Features.R
computeFeaturesCage | R Documentation |
If you have a txdb with correctly reassigned transcripts, use: [computeFeatures()]
computeFeaturesCage(
grl,
RFP,
RNA = NULL,
Gtf = NULL,
tx = NULL,
fiveUTRs = NULL,
cds = NULL,
threeUTRs = NULL,
faFile = NULL,
riboStart = 26,
riboStop = 34,
sequenceFeatures = TRUE,
uorfFeatures = TRUE,
grl.is.sorted = FALSE,
weight.RFP = 1L,
weight.RNA = 1L
)
grl |
a |
RFP |
RiboSeq reads as |
RNA |
RnaSeq reads as |
Gtf |
a TxDb object of a gtf file or path to gtf, gff .sqlite etc. |
tx |
a GRangesList of transcripts, normally called from: exonsBy(Gtf, by = "tx", use.names = T) only add this if you are not including Gtf file If you are using CAGE, you do not need to reassign these to the cage peaks, it will do it for you. |
fiveUTRs |
fiveUTRs as GRangesList, if you used cage-data to extend 5' utrs, remember to input CAGE assigned version and not original! |
cds |
a GRangesList of coding sequences |
threeUTRs |
a GRangesList of transcript 3' utrs, normally called from: threeUTRsByTranscript(Gtf, use.names = T) |
faFile |
a path to fasta indexed genome, an open |
riboStart |
usually 26, the start of the floss interval, see ?floss |
riboStop |
usually 34, the end of the floss interval |
sequenceFeatures |
a logical, default TRUE, include all sequence features, that is: Kozak, fractionLengths, distORFCDS, isInFrame, isOverlapping and rankInTx. uorfFeatures = FALSE will remove the 4 last. |
uorfFeatures |
a logical, default TRUE, include all uORF sequence features, that is: distORFCDS, isInFrame, isOverlapping and rankInTx |
grl.is.sorted |
logical (F), a speed up if you know argument grl is sorted, set this to TRUE. |
weight.RFP |
a vector (default: 1L). Can also be character name of column in RFP. As in translationalEff(weight = "score") for: GRanges("chr1", 1, "+", score = 5), would mean score column tells that this alignment region was found 5 times. |
weight.RNA |
Same as weightRFP but for RNA weights. (default: 1L) |
A specialized version if you don't have a correct txdb, for example with CAGE reassigned leaders while txdb is not updated. It is 2x faster for tested data. The point of this function is to give you the ability to input transcript etc directly into the function, and not load them from txdb. Each feature have a link to an article describing feature, try ?floss
a data.table with scores, each column is one score type, name of columns are the names of the scores, i.g [floss()] or [fpkm()]
Other features:
computeFeatures()
,
countOverlapsW()
,
disengagementScore()
,
distToCds()
,
distToTSS()
,
entropy()
,
floss()
,
fpkm()
,
fpkm_calc()
,
fractionLength()
,
initiationScore()
,
insideOutsideORF()
,
isInFrame()
,
isOverlapping()
,
kozakSequenceScore()
,
orfScore()
,
rankOrder()
,
ribosomeReleaseScore()
,
ribosomeStallingScore()
,
startRegion()
,
startRegionCoverage()
,
stopRegion()
,
subsetCoverage()
,
translationalEff()
# a small example without cage-seq data:
# we will find ORFs in the 5' utrs
# and then calculate features on them
if (requireNamespace("BSgenome.Hsapiens.UCSC.hg19")) {
library(GenomicFeatures)
# Get the gtf txdb file
txdbFile <- system.file("extdata", "hg19_knownGene_sample.sqlite",
package = "GenomicFeatures")
txdb <- loadDb(txdbFile)
# Extract sequences of fiveUTRs.
fiveUTRs <- fiveUTRsByTranscript(txdb, use.names = TRUE)[1:10]
faFile <- BSgenome.Hsapiens.UCSC.hg19::Hsapiens
tx_seqs <- extractTranscriptSeqs(faFile, fiveUTRs)
# Find all ORFs on those transcripts and get their genomic coordinates
fiveUTR_ORFs <- findMapORFs(fiveUTRs, tx_seqs)
unlistedORFs <- unlistGrl(fiveUTR_ORFs)
# group GRanges by ORFs instead of Transcripts
fiveUTR_ORFs <- groupGRangesBy(unlistedORFs, unlistedORFs$names)
# make some toy ribo seq and rna seq data
starts <- unlistGrl(ORFik:::firstExonPerGroup(fiveUTR_ORFs))
RFP <- promoters(starts, upstream = 0, downstream = 1)
score(RFP) <- rep(29, length(RFP)) # the original read widths
# set RNA seq to duplicate transcripts
RNA <- unlistGrl(exonsBy(txdb, by = "tx", use.names = TRUE))
#ORFik:::computeFeaturesCage(grl = fiveUTR_ORFs, RFP = RFP,
# RNA = RNA, Gtf = txdb, faFile = faFile)
}
# See vignettes for more examples
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.