findMapORFs | R Documentation |
This function can map spliced ORFs. It finds ORFs on the sequences of interest, but returns relative positions to the positions of 'grl' argument. For example, 'grl' can be exons of known transcripts (with genomic coordinates), and 'seq' sequences of those transcripts, in that case, this function will return genomic coordinates of ORFs found on transcript sequences.
findMapORFs(
grl,
seqs,
startCodon = startDefinition(1),
stopCodon = stopDefinition(1),
longestORF = TRUE,
minimumLength = 0,
groupByTx = FALSE,
grl_is_sorted = FALSE
)
grl |
A |
seqs |
(DNAStringSet or character vector) - DNA/RNA sequences to search
for Open Reading Frames. Can be both uppercase or lowercase. Easiest call to
get seqs if you want only regions from a fasta/fasta index pair is:
seqs = ORFik:::txSeqsFromFa(grl, faFile), where grl is a GRanges/List of
search regions and faFile is a |
startCodon |
(character vector) Possible START codons to search for.
Check |
stopCodon |
(character vector) Possible STOP codons to search for.
Check |
longestORF |
(logical) Default TRUE. Keep only the longest ORF per
unique stopcodon: (seqname, strand, stopcodon) combination, Note: Not longest
per transcript! You can also use function
|
minimumLength |
(integer) Default is 0. Which is START + STOP = 6 bp. Minimum length of ORF, without counting 3bps for START and STOP codons. For example minimumLength = 8 will result in size of ORFs to be at least START + 8*3 (bp) + STOP = 30 bases. Use this param to restrict search. |
groupByTx |
logical (default: FALSE), should output GRangesList be grouped by exons per ORF (TRUE) or by orfs per transcript (FALSE)? |
grl_is_sorted |
logical, default FALSE If FALSE will sort negative transcript in descending order for you. If you loaded ranges with default methods this is already the case, so you can set to TRUE to save some time. |
This function assumes that 'seq' is in widths relative to 'grl', and that their orders match. 1st seq is 1st grl object, etc.
See vignette for real life example.
A GRangesList of ORFs.
Other findORFs:
findORFs()
,
findORFsFasta()
,
findUORFs()
,
startDefinition()
,
stopDefinition()
# First show simple example using findORFs
# This sequence has ORFs at 1-9 and 4-9
seqs <- DNAStringSet("ATGATGTAA") # the dna transcript sequence
findORFs(seqs)
# lets assume that this sequence comes from two exons as follows
# Then we need to use findMapORFs instead of findORFs,
# for splicing information
gr <- GRanges(seqnames = "1", # chromosome 1
ranges = IRanges(start = c(21, 10), end = c(23, 15)),
strand = "-", #
names = "tx1") #From transcript 1 on chr 1
grl <- GRangesList(tx1 = gr) # 1 transcript with 2 exons
findMapORFs(grl, seqs) # ORFs are properly mapped to its genomic coordinates
grl <- c(grl, grl)
names(grl) <- c("tx1", "tx2")
findMapORFs(grl, c(seqs, seqs))
# More advanced example and how to save sequences found in vignette
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.