annotDT2exonList: Convert a data.table of gene annotations to a list of exon...

View source: R/annotDT2exonList.R

annotDT2exonList    R Documentation

Convert a data.table of gene annotations to a list of exon sequences

Description

A data.table with information on the start and end positions of exons is used to generate a list containing each exon as an indexed item. The indexed exon sequences are in the order and orientation in which they would be transcribed. That is, they can be pasted together and translated to form the protein sequence. Note that this function has been designed to work for a single gene on a single chromosome. To process multiple genes, call the function in a loop, once per gene.
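Because the function handles one gene at a time, a multi-gene table has to be split before it is passed in. The sketch below shows one way to do that in base R. The helper loopGenes and the GENE column are assumptions made for illustration and are not part of genomalicious; nrow is used as a stand-in for the exon function so the block runs without the package.

```r
# Hypothetical annotation table holding two genes.
# GENE is an assumed column name; the function itself does not require it.
annot <- data.frame(
  GENE   = c("geneA", "geneA", "geneB"),
  CHROM  = c("chr1", "chr1", "chr2"),
  START  = c(101L, 301L, 50L),
  END    = c(180L, 360L, 120L),
  STRAND = c("+", "+", "-"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of genomalicious): split the table by gene
# and apply exonFun to each per-gene subset. Extra arguments go to exonFun.
loopGenes <- function(annot, exonFun, geneCol = "GENE", ...) {
  perGene <- split(annot, annot[[geneCol]])
  lapply(perGene, function(geneAnnot) exonFun(geneAnnot, ...))
}

# Stand-in run: count the exon rows per gene.
exonCounts <- loopGenes(annot, exonFun = nrow)

# With the package loaded and a data.table as input, the call might look like:
# exonsByGene <- loopGenes(annotDT, annotDT2exonList, genomeSeq = genomeSeq)
```

The result is a list named by gene, so each gene's exon list can be retrieved by name afterwards.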

Usage

annotDT2exonList(
  annotDT,
  genomeSeq,
  chromCol = "CHROM",
  startCol = "START",
  endCol = "END",
  strandCol = "STRAND"
)

Arguments

annotDT

Data.table: Contains information on the exon positions. Requires the following columns:

  1. The chromosome ID (see param chromCol).

  2. The start position of the exon on the positive strand, that is, left to right on the genomic sequence (see param startCol).

  3. The end position of the exon on the positive strand, that is, left to right on the genomic sequence (see param endCol).

  4. The strand of the exon: '+' for positive and '-' for negative (see param strandCol).
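The four required columns can be checked before calling the function. The following is a minimal sketch under stated assumptions: checkAnnot is a hypothetical helper, not part of genomalicious, and the single-strand check is an assumption that follows from the single-gene design rather than a documented requirement.

```r
# Hypothetical two-exon gene, using the default column names.
annot <- data.frame(
  CHROM  = c("chr1", "chr1"),
  START  = c(101L, 301L),
  END    = c(180L, 360L),
  STRAND = c("+", "+"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of genomalicious).
checkAnnot <- function(annot, chromCol = "CHROM", startCol = "START",
                       endCol = "END", strandCol = "STRAND") {
  needed <- c(chromCol, startCol, endCol, strandCol)
  absent <- setdiff(needed, names(annot))
  if (length(absent) > 0) {
    stop("Missing columns: ", paste(absent, collapse = ", "))
  }
  stopifnot(
    # Coordinates are given left to right on the positive strand.
    all(annot[[startCol]] <= annot[[endCol]]),
    # Only '+' and '-' are valid strand values.
    all(annot[[strandCol]] %in% c("+", "-")),
    # One gene on one chromosome; the single strand is an assumption.
    length(unique(annot[[chromCol]])) == 1,
    length(unique(annot[[strandCol]])) == 1
  )
  invisible(TRUE)
}

checkAnnot(annot)
```

Since annotDT must be a data.table, a data.frame like the one above would still need converting, for example with data.table::as.data.table(annot).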

genomeSeq

DNAStringSet: The loaded genome sequence as a DNAStringSet object, as per R's Biostrings package (Pagès et al.). The sequence names must match the chromosome IDs in annotDT.
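A name mismatch between the genome and the annotation table is easy to check up front. The sketch below uses a hypothetical helper, checkChromNames, that is not part of genomalicious; plain character vectors stand in for annotDT$CHROM and names(genomeSeq) so that the block runs without Biostrings.

```r
# Illustrative helper (not part of genomalicious): confirm that every
# chromosome ID in the annotations is present among the sequence names.
checkChromNames <- function(chromIDs, seqNames) {
  unmatched <- setdiff(unique(chromIDs), seqNames)
  if (length(unmatched) > 0) {
    stop("No genome sequence named: ", paste(unmatched, collapse = ", "))
  }
  invisible(TRUE)
}

# Stand-in values; in practice these would be annotDT$CHROM and names(genomeSeq).
chromIDs <- c("chr1", "chr1")
seqNames <- c("chr1", "chr2")

checkChromNames(chromIDs, seqNames)
```

Sequence names read from a FASTA file may carry the full header line, so they may need trimming to the bare ID before they match.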

chromCol

Character: The chromosome column name in annotDT. Default is 'CHROM'.

startCol

Character: The name of the exon start position column in annotDT. Default is 'START'.

endCol

Character: The name of the exon end position column in annotDT. Default is 'END'.

strandCol

Character: The name of the exon strand column in annotDT. Default is 'STRAND'.

Value

Returns a list in which each indexed item is a character vector holding one exon sequence extracted from the genome. Exons are listed in their order of transcription and in the correct orientation.
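To make "order of transcription" and "correct orientation" concrete, here is a conceptual sketch in base R on a toy sequence. It is not the package's implementation: a named character string stands in for the DNAStringSet, revComp is a hypothetical helper, and the example assumes that a negative-strand gene is read from its rightmost exon to its leftmost, with each exon reverse-complemented.

```r
# Toy genome: one chromosome as a plain string (stand-in for a DNAStringSet).
genome <- c(chr1 = "AAACCCGGGTTTACGT")

# Hypothetical two-exon gene on the negative strand.
annot <- data.frame(
  CHROM  = c("chr1", "chr1"),
  START  = c(1, 10),
  END    = c(6, 16),
  STRAND = c("-", "-"),
  stringsAsFactors = FALSE
)

# Illustrative helper: reverse complement of a DNA string.
revComp <- function(x) {
  comp <- chartr("ACGT", "TGCA", x)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

strand <- unique(annot$STRAND)
stopifnot(length(strand) == 1)

# Transcription order: ascending start on '+', descending start on '-'.
annot <- annot[order(annot$START, decreasing = (strand == "-")), ]

# Extract each exon and orient it as it would be transcribed.
exons <- lapply(seq_len(nrow(annot)), function(i) {
  s <- substr(genome[[annot$CHROM[i]]], annot$START[i], annot$END[i])
  if (strand == "-") revComp(s) else s
})

# Pasting the ordered exons gives the spliced sequence.
transcript <- paste(unlist(exons), collapse = "")
# exons      -> "ACGTAAA", "GGGTTT"
# transcript -> "ACGTAAAGGGTTT"
```

The same paste step applies to the list returned by annotDT2exonList, since its items are already ordered and oriented.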


j-a-thia/genomalicious documentation built on Oct. 19, 2024, 7:51 p.m.