computeMLC: Computation of the most-like CDS region of an RNA sequence

View source: R/Sequence.R

computeMLC R Documentation

Description

This function computes the most-like CDS (MLC) region of a single RNA sequence. Two methods are supported: one based on the longest open reading frame (ORF) and one based on the maximum subarray sum (MSS).
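To make the ORF-based idea concrete, here is a minimal sketch of a longest-ORF search in base R. It is not the package's implementation: the helper name longest_orf is hypothetical, and the sketch assumes the standard start codon ATG, the standard stop codons TAA, TAG and TGA, and only the three forward reading frames.

```r
# Hypothetical helper, not part of the package: find the longest ORF
# (ATG ... stop codon, stop included) in the three forward reading frames.
longest_orf <- function(seq_chars) {
  s <- toupper(paste(seq_chars, collapse = ""))
  s <- chartr("U", "T", s)                 # treat RNA and DNA letters alike
  stops <- c("TAA", "TAG", "TGA")
  n <- nchar(s)
  best <- list(start = NA_integer_, end = NA_integer_, length = 0L)
  for (frame in 1:3) {
    if (frame > n - 2) next                # no complete codon in this frame
    starts <- seq(frame, n - 2, by = 3)
    codons <- substring(s, starts, starts + 2)
    open_at <- NA_integer_                 # position of the current start codon
    for (k in seq_along(codons)) {
      if (is.na(open_at)) {
        if (codons[k] == "ATG") open_at <- starts[k]
      } else if (codons[k] %in% stops) {
        orf_end <- starts[k] + 2
        orf_len <- orf_end - open_at + 1
        if (orf_len > best$length) {
          best <- list(start = as.integer(open_at),
                       end = as.integer(orf_end),
                       length = as.integer(orf_len))
        }
        open_at <- NA_integer_             # look for the next start codon
      }
    }
  }
  best
}

# Toy sequence as a vector of single characters (illustrative only).
toy_seq <- strsplit("ccatgaaaccctaggg", split = "")[[1]]
orf <- longest_orf(toy_seq)
```

The sketch keeps the first start codon seen in each frame, so nested start codons never shorten an ORF. How computeMLC handles ORFs without a stop codon or on the reverse strand is not stated here, so the sketch ignores both cases.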

Usage

computeMLC(oneRNA, mode = c("ORF", "MSS"))

Arguments

oneRNA

A single RNA sequence loaded by the function read.fasta from the seqinr package, or a list containing one RNA sequence. The sequence should be a vector of single characters.
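As an illustration of the expected input layout, the following base R sketch builds a vector of single characters from a plain string and wraps it in a list of length one. The sequence and the name toy_rna are made up for the example. Lowercase letters are used because read.fasta returns lowercase single characters by default.

```r
# A toy transcript given as one string (illustrative data only).
rna_string <- "atggcctaa"

# Split the string into a vector of single characters.
rna_chars <- strsplit(rna_string, split = "")[[1]]

# Wrap the vector in a list of length one; the element name is arbitrary.
one_rna <- list(toy_rna = rna_chars)
```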

mode

Can be "ORF" (compute the longest open reading frame), "MSS" (compute the subsequence with the maximum subarray sum of hexamer scores), or both.
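The maximum subarray sum step can be sketched with Kadane's algorithm, which finds the contiguous run of values with the largest total in a single pass. The helper name max_subarray and the toy scores below are hypothetical. How the package derives hexamer scores from the sequence is not shown here, so the sketch starts from a ready-made numeric vector.

```r
# Hypothetical helper, not part of the package: Kadane's algorithm.
# Returns the start index, end index and sum of the best contiguous run.
max_subarray <- function(scores) {
  stopifnot(is.numeric(scores), length(scores) > 0)
  best_sum <- scores[1]
  best_start <- 1L
  best_end <- 1L
  cur_sum <- scores[1]
  cur_start <- 1L
  for (i in seq_along(scores)[-1]) {
    if (cur_sum < 0) {
      # A negative running total can only hurt, so restart the run here.
      cur_sum <- scores[i]
      cur_start <- i
    } else {
      cur_sum <- cur_sum + scores[i]
    }
    if (cur_sum > best_sum) {
      best_sum <- cur_sum
      best_start <- cur_start
      best_end <- i
    }
  }
  list(start = best_start, end = best_end, sum = best_sum)
}

# Made-up per-position scores standing in for hexamer scores.
toy_scores <- c(-1.2, 0.8, 1.5, -0.4, 2.1, -3.0, 0.6)
mss <- max_subarray(toy_scores)
```

For the toy vector, the best run spans positions 2 to 5. In an MSS-based method, the start and end indices of the best run would then be mapped back to sequence coordinates.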

Value

A data frame containing the MLC region, its length and its coverage. The MSS-based method follows [1] and [2]. The ORF extraction function is borrowed from our previous work [3].

References

[1] Yang C, Yang L, Zhou M, et al. LncADeep: an ab initio lncRNA identification and functional annotation tool based on deep learning. Bioinformatics. 2018; 34(22):3825-3834.

[2] Sun L, Luo H, Bu D, et al. Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts. Nucleic Acids Research. 2013; 41(17):e166.

[3] Han S, Liang Y, Ma Q, et al. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information and physicochemical property. Briefings in Bioinformatics. 2019; 20(6):2009-2027.

Examples


# Use "read.fasta" function of package "seqinr" to read a FASTA file:

seqRNA <- seqinr::read.fasta(file =
"http://www.ncbi.nlm.nih.gov/WebSub/html/help/sample_files/nucleotide-sample.txt")

# Compute the MLC region using both "ORF" and "MSS" methods:

MLC_list <- lapply(seqRNA, computeMLC, mode = c("ORF", "MSS"))
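The lapply call returns a list with one element per input sequence. Assuming each element is a data frame with the same columns, as the Value section suggests, the results can be stacked into a single table with base R. The column names and values below are hypothetical stand-ins, not the actual output of computeMLC.

```r
# Toy stand-in for MLC_list; column names and values are made up.
toy_list <- list(
  seq1 = data.frame(MLC.length = 300L, MLC.coverage = 0.60),
  seq2 = data.frame(MLC.length = 120L, MLC.coverage = 0.25)
)

# Stack the per-sequence data frames into one table.
toy_table <- do.call(rbind, toy_list)
```

With real output, the same pattern would be do.call(rbind, MLC_list), provided every element has identical columns.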


HAN-Siyu/ncProR documentation built on Nov. 3, 2023, 12:08 a.m.