predORF: Predict ORFs
In tgirke/systemPipeR: systemPipeR: Workflow Environment for Data Analysis and Report Generation

predORF

R Documentation

Predict ORFs

Description

Predicts open reading frames (ORFs) and coding sequences (CDSs) in DNA sequences provided as DNAString or DNAStringSet objects.

Usage

predORF(x, n = 1, type = "grl", mode = "orf", strand = "sense",
        longest_disjoint = FALSE, startcodon = "ATG",
        stopcodon = c("TAA", "TAG", "TGA"))

Arguments

`x`	DNA query sequence(s) provided as `DNAString` or `DNAStringSet` object.
`n`	Defines the maximum number of ORFs to return for each input sequence. The identified ORFs are sorted by decreasing length. For instance, `n=1` (default) returns the longest ORF, `n=2` the two longest ones, and so on. The setting `n='all'` returns all identified ORFs.
`type`	One of three options provided as character values: `'df'` returns results as `data.frame`, while `'gr'` and `'grl'` (default) return them as `GRanges` or `GRangesList` objects, respectively.
`mode`	The setting `mode='orf'` (default) returns continuous reading frames that begin with a start codon and end with a stop codon. The setting `mode='cds'` returns continuous reading frames that do not need to begin with a start codon or end with a stop codon.
`strand`	One of three options passed on as a character vector of length one: `'sense'` performs the predictions only for the sense strand of the query sequence(s), `'antisense'` only for the antisense strand, and `'both'` for both strands.
`longest_disjoint`	If set to `TRUE` and `n='all'`, the results are subset to a set of non-overlapping ORFs that contains the longest ORF.
`startcodon`	Defines the start codon(s) for ORF predictions. The default is set to the standard start codon 'ATG'. Any custom set of triplet DNA sequences can be assigned here.
`stopcodon`	Defines the stop codon(s) for ORF predictions. The default is set to the three standard stop codons 'TAA', 'TAG' and 'TGA'. Any custom set of triplet DNA sequences can be assigned here.
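
How `n`, `startcodon` and `stopcodon` interact in `mode='orf'` can be illustrated with a minimal base-R sketch. It is not the systemPipeR implementation: the helper name `find_orfs_sketch` is hypothetical, it scans only the sense strand of a plain character string, it opens an ORF at the first start codon seen in each frame, and it assumes that the reported range includes the stop codon.

```r
# Illustrative sketch only; find_orfs_sketch() is a hypothetical helper,
# not part of systemPipeR.
find_orfs_sketch <- function(dna, n = 1, startcodon = "ATG",
                             stopcodon = c("TAA", "TAG", "TGA")) {
  dna <- toupper(dna)
  len <- nchar(dna)
  orfs <- data.frame(start = integer(0), end = integer(0))
  for (frame in 1:3) {
    if (len - 2 < frame) next            # sequence too short for this frame
    pos <- seq(frame, len - 2, by = 3)   # codon start positions in this frame
    codons <- substring(dna, pos, pos + 2)
    open_at <- NA                        # start of the currently open ORF
    for (i in seq_along(codons)) {
      if (is.na(open_at) && codons[i] %in% startcodon) {
        open_at <- pos[i]
      } else if (!is.na(open_at) && codons[i] %in% stopcodon) {
        # Assumption: the range runs through the last base of the stop codon
        orfs <- rbind(orfs, data.frame(start = open_at, end = pos[i] + 2))
        open_at <- NA
      }
    }
  }
  orfs$width <- orfs$end - orfs$start + 1
  # Longest ORFs first, mirroring the ordering described for 'n'
  orfs <- orfs[order(orfs$width, decreasing = TRUE), , drop = FALSE]
  rownames(orfs) <- NULL
  if (identical(n, "all")) orfs else head(orfs, n)
}

dna <- "CCATGAAATAGGATGCCCCCCTGA"
find_orfs_sketch(dna, n = 1)      # longest ORF only
find_orfs_sketch(dna, n = "all")  # every ORF found, longest first
```

For real data, `predORF` should be used instead, since it operates on `DNAString`/`DNAStringSet` objects and also covers the antisense strand and the CDS mode.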

Value

Returns the ORF/CDS ranges identified in the query sequences as a GRanges, GRangesList or data.frame object. The type argument defines which of them is returned. The objects contain the following columns:

seqnames: names of query sequences
subject_id: identified ORF/CDS ranges numbered by query
start/end: start and end positions of ORF/CDS ranges
strand: strand of query sequence used for prediction
width: length of subject range in bases
inframe2end: frame of identified ORF/CDS relative to 3' end of query sequence. This can be important if the query sequence was extracted directly upstream of an ORF (e.g. 5' UTR upstream of main ORF). The value 1 stands for in-frame with downstream ORF, while 2 or 3 indicates a shift of one or two bases, respectively.
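
One plausible reading of the inframe2end column can be written down as a short base-R sketch. The helper name `inframe2end_sketch` and the formula are assumptions derived from the description above, not taken from the package source, so the exact convention used by `predORF` (in particular which shift maps to 2 and which to 3) may differ.

```r
# Hypothetical helper, not part of systemPipeR: frame of a range relative
# to the 3' end of the query sequence.
inframe2end_sketch <- function(end, seq_length) {
  # Number of bases between the end of the range and the end of the query
  remaining <- seq_length - end
  # 0 leftover bases -> 1 (in frame); 1 or 2 leftover bases -> 2 or 3
  (remaining %% 3) + 1
}

# Ranges ending at positions 24, 23, 22 and 11 of a 24 bp query
inframe2end_sketch(end = c(24, 23, 22, 11), seq_length = 24)
```

Under this reading, a uORF with value 1 ends a whole number of codons before the 3' end of the 5' UTR and is therefore in frame with a main ORF that starts directly downstream.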

Author(s)

Thomas Girke

Examples

## Load DNA sample data set from Biostrings package
file <- system.file("extdata", "someORF.fa", package="Biostrings")
dna <- readDNAStringSet(file)

## Predict longest ORF for sense strand in each query sequence
(orf <- predORF(dna[1:4], n=1, type="gr", mode="orf", strand="sense"))

## Not run: 
## Usage for more complex example
library(txdbmaker); library(systemPipeRdata)
library(rtracklayer) # provides export.gff3() used below
gff <- system.file("extdata/annotation", "tair10.gff", package="systemPipeRdata")
txdb <- makeTxDbFromGFF(file=gff, format="gff3", organism="Arabidopsis")
futr <- fiveUTRsByTranscript(txdb, use.names=TRUE)
genome <- system.file("extdata/annotation", "tair10.fasta", package="systemPipeRdata")
dna <- extractTranscriptSeqs(FaFile(genome), futr)
uorf <- predORF(dna, n="all", mode="orf", longest_disjoint=TRUE, strand="sense")
grl_scaled <- scaleRanges(subject=futr, query=uorf, type="uORF", verbose=TRUE)
export.gff3(unlist(grl_scaled), "uorf.gff")

## End(Not run)
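
The effect of `longest_disjoint=TRUE` in the uORF call above can be pictured as a greedy selection: take the longest range first, then keep each further range only if it does not overlap one already kept. The following base-R sketch shows that idea on plain start/end vectors. The helper name `longest_disjoint_sketch` is hypothetical and the greedy strategy is an assumption about the intended behavior, not a copy of the package code.

```r
# Hypothetical helper, not part of systemPipeR: indices of a
# non-overlapping subset of ranges that contains the longest one.
longest_disjoint_sketch <- function(start, end) {
  ord <- order(end - start + 1, decreasing = TRUE)  # longest range first
  keep <- integer(0)
  for (i in ord) {
    # Closed ranges overlap if each one starts before the other one ends
    overlaps <- start[i] <= end[keep] & end[i] >= start[keep]
    if (!any(overlaps)) keep <- c(keep, i)
  }
  sort(keep)
}

# Four candidate ranges on one query sequence
start <- c(1, 20, 60, 90)
end   <- c(50, 70, 85, 120)
longest_disjoint_sketch(start, end)  # indices of the ranges that are kept
```

In this toy input the second range is the longest, so the first and third ranges are dropped because they overlap it, while the fourth is kept.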

tgirke/systemPipeR documentation built on June 13, 2025, 1:38 p.m.
