read_annotation: read_annotation

View source: R/MODULE_1_CELP_BIAS.R

read_annotation    R Documentation

read_annotation

Description

Creates an annotation data table from a txt file.

Usage

read_annotation(annotation_file)

Arguments

annotation_file

Path to an annotation txt file listing transcript names and the lengths of their 5'UTR, CDS and 3'UTR segments. The annotation table has five columns: transcript, l_tr, l_utr5, l_cds and l_utr3.
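To make the five-column layout concrete, the sketch below writes a toy annotation file and reads it back with base R. It assumes a tab-delimited file with a header row, which this page does not confirm; the transcript IDs and lengths are invented for illustration, and read.delim stands in for read_annotation so the snippet runs without Ribolog installed.

```r
# Hypothetical toy annotation: IDs and lengths are made up for illustration.
toy <- data.frame(
  transcript = c("TX_A", "TX_B"),
  l_tr       = c(1500L, 900L),   # assumed here to be the full transcript length
  l_utr5     = c(100L, 50L),
  l_cds      = c(1200L, 600L),
  l_utr3     = c(200L, 250L)
)

# Write the table as tab-delimited text with a header row (an assumed format).
toy_file <- tempfile(fileext = ".txt")
write.table(toy, toy_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back with base R and inspect the columns.
annotation_toy <- read.delim(toy_file, stringsAsFactors = FALSE)
str(annotation_toy)
```

If the file format matches what read_annotation expects, the same path could presumably be passed as read_annotation(toy_file).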

Details

Several Ribolog functions use the transcript annotation table. read_annotation reads the annotation from a source file and stores it in a data table object which can be subsequently used by other functions. Transcript names and segment lengths in the annotation must correspond to the reference sequences to which the reads were mapped. We recommend creating the annotation txt file directly from the fasta file of cDNA sequences using the python script Biomart_cDNA_fasta_to_rW_annotation_and_reheadered_longest_CDS_cDNA_fasta.py provided with the Ribolog package.
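The requirement that names and segment lengths match the mapping reference can be checked before running downstream functions. The helper below is a minimal sketch and not part of Ribolog: check_annotation and its ref_lengths argument (a named integer vector of reference sequence lengths) are hypothetical, and the check that l_utr5 + l_cds + l_utr3 equals l_tr rests on the assumption that l_tr is the full transcript length.

```r
# Hypothetical helper, not a Ribolog function.
# annotation:  data frame with columns transcript, l_tr, l_utr5, l_cds, l_utr3
# ref_lengths: named integer vector, names = reference sequence IDs
check_annotation <- function(annotation, ref_lengths) {
  # Transcripts in the annotation that are absent from the reference
  missing_ids <- setdiff(annotation$transcript, names(ref_lengths))

  # Rows whose segment lengths do not add up to l_tr (assumed total length)
  seg_sum <- annotation$l_utr5 + annotation$l_cds + annotation$l_utr3
  bad_sum <- annotation$transcript[seg_sum != annotation$l_tr]

  # Rows whose l_tr disagrees with the reference sequence length
  known   <- annotation$transcript %in% names(ref_lengths)
  ids     <- annotation$transcript[known]
  bad_len <- ids[unname(annotation$l_tr[known] != ref_lengths[ids])]

  list(missing_ids = missing_ids, bad_sum = bad_sum, bad_len = bad_len)
}

# Toy inputs with invented values: TX_B has a length mismatch,
# TX_C is missing from the reference and its segments do not sum to l_tr.
ann <- data.frame(
  transcript = c("TX_A", "TX_B", "TX_C"),
  l_tr   = c(1500L, 900L, 700L),
  l_utr5 = c(100L, 50L, 100L),
  l_cds  = c(1200L, 600L, 500L),
  l_utr3 = c(200L, 250L, 90L),
  stringsAsFactors = FALSE
)
ref <- c(TX_A = 1500L, TX_B = 950L)
res <- check_annotation(ann, ref)
print(res)
```

In practice ref_lengths would be derived from the same fasta file the reads were mapped to; how that vector is built is left open here.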

Note

Fasta files of cDNA sequences (one transcript per gene, the one with the longest CDS) and annotation tables for human and a number of popular model organisms (Saccharomyces cerevisiae yeast, Zea mays maize, Arabidopsis thaliana thale cress, Drosophila melanogaster fruit fly, Caenorhabditis elegans worm, Danio rerio zebrafish, Mus musculus mouse and Rattus norvegicus rat) are included with the Ribolog package.

Examples

annotation_human_cDNA <- read_annotation("<file.path>/Human.GRC38.96_annotation.txt")

goodarzilab/Ribolog documentation built on Oct. 7, 2022, 10:14 p.m.