locate_codons: Locate codons in transcripts


View source: R/locate-codons.R

Description

Locate the genome coordinates of a desired set of codons, given a data.frame of CDS coordinates (see CDS) and the corresponding BSgenome sequence database.

Usage

locate_codons(
  cds,
  genome,
  codons = c("CAA", "CAG", "CGA", "TGG", "TGG"),
  positions = c(1L, 1L, 1L, 2L, 3L),
  switch_strand = c(F, F, F, T, T),
  cores = getOption("mc.cores", 1L)
)

Arguments

cds

A data.frame of CDS coordinates. See CDS for details on required columns.

genome

A BSgenome sequence database, or a Biostrings object. The CDS coordinates in cds should correspond to this genome assembly.

codons

A character vector of codons to consider. Defaults to c('CAA', 'CAG', 'CGA', 'TGG', 'TGG'), which are all of the codons with iSTOP-targetable bases. TGG is listed twice because two of its positions are targetable (see positions).

positions

An integer vector giving, for each entry of codons, the position within the codon to use for the returned coordinates. Should be the same length as codons. Defaults to c(1L, 1L, 1L, 2L, 3L), which are the positions in codons that can be targeted with iSTOP.

switch_strand

A logical vector indicating whether the corresponding codon is targeted on the opposite strand. Determines the resulting value of sg_strand. Should be the same length as codons. Defaults to c(F, F, F, T, T).

cores

Number of cores to use for parallel processing with pblapply. Defaults to getOption("mc.cores", 1L).
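
The codons, positions and switch_strand arguments are parallel vectors: element i of positions and switch_strand describes element i of codons. The following is a minimal base-R sketch of that pairing using the documented defaults. The name targets and its column names are illustrative only and are not part of iSTOP.

```r
# Documented defaults of locate_codons(), written out as parallel vectors.
codons        <- c("CAA", "CAG", "CGA", "TGG", "TGG")
positions     <- c(1L, 1L, 1L, 2L, 3L)
switch_strand <- c(FALSE, FALSE, FALSE, TRUE, TRUE)

# All three vectors must have the same length.
stopifnot(length(codons) == length(positions),
          length(codons) == length(switch_strand))

# `targets` is a hypothetical name used for illustration, not an iSTOP object.
targets <- data.frame(
  codon         = codons,
  position      = positions,
  base          = substr(codons, positions, positions),  # base at that position
  switch_strand = switch_strand,
  stringsAsFactors = FALSE
)
print(targets)
```

With the defaults, the selected base is C for the first three codons and G for both TGG entries, and the two TGG entries are the ones flagged in switch_strand.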

Details

Each transcript is processed independently based on the tx column of the cds data.frame.
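
Per-transcript processing of this kind can be sketched in base R with split(). Only the tx column is taken from the text above. The exon, start and end columns, and the length calculation, are assumptions made for illustration and may not match the columns that CDS actually requires.

```r
# Toy CDS table. Only `tx` comes from the documentation;
# `exon`, `start` and `end` are hypothetical columns.
cds <- data.frame(
  tx    = c("tx1", "tx1", "tx2"),
  exon  = c(1L, 2L, 1L),
  start = c(101L, 301L, 501L),
  end   = c(160L, 360L, 590L),
  stringsAsFactors = FALSE
)

# One data.frame per transcript, keyed by the value of `tx`.
by_tx <- split(cds, cds$tx)

# Each piece is handled on its own; here we only total the CDS length.
cds_length <- vapply(by_tx, function(x) sum(x$end - x$start + 1L), integer(1))
print(cds_length)
```

Because each element of by_tx is independent, the same per-transcript function could be handed to pbapply::pblapply() with its cl argument set to the number of cores. Whether iSTOP splits its input in exactly this way is an assumption.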

CDS validation - Each CDS is validated by checking that its coordinates yield a DNA sequence that begins with a start codon, ends with a stop codon, has a length that is a multiple of 3, and has no internal in-frame stop codons.
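
The four checks listed above can be sketched on a plain character string as follows. The helper is_valid_cds is hypothetical and is not an iSTOP function. It assumes ATG as the only start codon and the three standard stop codons, and it works on an already-extracted sequence rather than on genome coordinates.

```r
# Hypothetical helper, not part of iSTOP: validates a CDS given as a string.
# Assumes the standard genetic code (ATG start; TAA, TAG, TGA stops).
is_valid_cds <- function(dna, start = "ATG", stops = c("TAA", "TAG", "TGA")) {
  dna <- toupper(dna)
  n <- nchar(dna)
  # Length must be a multiple of 3 and hold at least a start and a stop codon.
  if (n < 6L || n %% 3L != 0L) return(FALSE)
  # Split the sequence into in-frame codons.
  first_base <- seq(1L, n, by = 3L)
  cod <- substring(dna, first_base, first_base + 2L)
  k <- length(cod)
  internal <- cod[-c(1L, k)]
  cod[1L] == start &&            # begins with a start codon
    cod[k] %in% stops &&         # ends with a stop codon
    !any(internal %in% stops)    # no internal in-frame stop codon
}

is_valid_cds("ATGCAATGGTAA")  # TRUE
is_valid_cds("ATGTAGCAATAA")  # FALSE: internal in-frame stop (TAG)
is_valid_cds("ATGCAATGGTA")   # FALSE: length is not a multiple of 3
```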

No match? - If a CDS passes validation, but does not have any of the considered codons, the transcript will still be included in the resulting data.frame, but coordinates will be missing (i.e. NA).
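
The no-match behaviour can be illustrated with a simplified in-frame search. The helper find_codon_positions and its output columns are hypothetical. It reports 1-based positions within the CDS string rather than genome coordinates, and it assumes the input has already passed validation.

```r
# Hypothetical helper, not part of iSTOP: in-frame codon search on a string.
# Returns positions within the CDS, not genome coordinates.
find_codon_positions <- function(dna,
                                 codons    = c("CAA", "CAG", "CGA", "TGG", "TGG"),
                                 positions = c(1L, 1L, 1L, 2L, 3L)) {
  # Assumes a validated CDS (length a multiple of 3, at least one codon).
  first_base <- seq(1L, nchar(dna) - 2L, by = 3L)
  in_frame <- substring(dna, first_base, first_base + 2L)

  codon_out <- character(0)
  pos_out <- integer(0)
  for (i in seq_along(codons)) {
    s <- first_base[in_frame == codons[i]]
    codon_out <- c(codon_out, rep(codons[i], length(s)))
    pos_out <- c(pos_out, s + positions[i] - 1L)  # one row per targetable base
  }

  if (length(pos_out) == 0L) {
    # No match: keep a single row with missing coordinates.
    codon_out <- NA_character_
    pos_out <- NA_integer_
  }
  ord <- order(pos_out)
  data.frame(codon = codon_out[ord], cds_pos = pos_out[ord],
             stringsAsFactors = FALSE)
}

hit  <- find_codon_positions("ATGCAATGGTAA")  # contains CAA and TGG in frame
miss <- find_codon_positions("ATGAAATAA")     # contains none of the codons
print(hit)
print(miss)
```

In this sketch the TGG codon contributes two rows, one per targetable position, while the sequence without any considered codon still yields one row whose coordinate is NA.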

Value

A data.frame with the following columns where each row represents a single targetable coordinate (codons with multiple targetable coordinates will have a separate row for each):


CicciaLab/iSTOP documentation built on May 9, 2021, 4:55 p.m.