View source: R/locate-codons.R
Locate genome coordinates of a desired set of codons given a data.frame of CDS coordinates (see CDS), and its corresponding BSgenome sequence database.
locate_codons(
  cds,
  genome,
  codons = c("CAA", "CAG", "CGA", "TGG", "TGG"),
  positions = c(1L, 1L, 1L, 2L, 3L),
  switch_strand = c(F, F, F, T, T),
  cores = getOption("mc.cores", 1L)
)
cds |
A data.frame of CDS coordinates. See CDS for details on required columns. |
genome |
A BSgenome sequence database,
or a Biostrings. CDS coordinates in |
codons |
A character vector of codons to consider. Defaults to c("CAA", "CAG", "CGA", "TGG", "TGG"). |
positions |
An integer vector of positions in |
switch_strand |
A logical vector indicating whether the corresponding
|
cores |
Number of cores to use for parallel processing with pblapply |
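The default values of codons, positions and switch_strand all have length five, which suggests they are read as parallel vectors. The sketch below only lines the defaults up side by side. It assumes (this is not confirmed by the text above) that entry i of each vector describes one target, and that positions is a 1-based position within the codon. The name targets is made up for illustration.

```r
# Defaults copied from the Usage section (T/F written out as TRUE/FALSE)
codons        <- c("CAA", "CAG", "CGA", "TGG", "TGG")
positions     <- c(1L, 1L, 1L, 2L, 3L)
switch_strand <- c(FALSE, FALSE, FALSE, TRUE, TRUE)

# Assumption: the vectors are paired element-wise, so lengths must agree
stopifnot(
  length(codons) == length(positions),
  length(codons) == length(switch_strand)
)

# One row per (codon, position, switch_strand) combination
targets <- data.frame(
  codon = codons,
  position = positions,
  switch_strand = switch_strand,
  stringsAsFactors = FALSE
)
print(targets)
```

Under that reading, TGG is listed twice because two of its bases (positions 2 and 3) are considered separately.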
Each transcript is processed independently based on the tx column of the cds data.frame.
CDS validation - Each CDS is validated by checking that the coordinates result in a DNA sequence that begins with a start codon, ends with a stop codon, has a length that is a multiple of 3 and has no internal in-frame stop codons.
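The four validation checks can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: validate_cds is a hypothetical helper, and it assumes ATG as the only start codon and TAA, TAG, TGA as the stop codons.

```r
# Hypothetical helper (not part of the package): applies the four checks
# described above to a CDS supplied as a single DNA string.
validate_cds <- function(dna,
                         start_codon = "ATG",                 # assumption
                         stop_codons = c("TAA", "TAG", "TGA")) {  # assumption
  dna <- toupper(dna)
  n <- nchar(dna)
  # Length must be a multiple of 3 and hold at least a start and a stop codon
  if (n < 6L || n %% 3L != 0L) return(FALSE)
  # Split the sequence into in-frame codons
  first <- seq(1L, n, by = 3L)
  cod <- substring(dna, first, first + 2L)
  k <- length(cod)
  cod[1L] == start_codon &&          # begins with a start codon
    cod[k] %in% stop_codons &&       # ends with a stop codon
    !any(cod[-k] %in% stop_codons)   # no internal in-frame stop codon
}

validate_cds("ATGCAATGGTAA")  # TRUE: ATG CAA TGG TAA
validate_cds("ATGTAACAATAA")  # FALSE: internal in-frame TAA
validate_cds("ATGCAATGGTA")   # FALSE: length is not a multiple of 3
```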
No match? - If a CDS passes validation, but does not have any of the considered codons, the transcript will still be included in the resulting data.frame, but coordinates will be missing (i.e. NA).
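Because transcripts without a match stay in the result with NA coordinates, downstream code usually needs to separate them from real hits. The sketch below uses a toy data.frame as a stand-in for a locate_codons() result. The values are invented, only three of the documented columns are included, and it assumes genome_coord is among the columns set to NA.

```r
# Toy stand-in for a locate_codons() result (values invented for illustration)
res <- data.frame(
  tx           = c("tx1", "tx1", "tx2"),
  codon        = c("CAA", "TGG", NA),
  genome_coord = c(1050L, 1102L, NA),
  stringsAsFactors = FALSE
)

# Transcripts that passed validation but contain none of the considered codons
no_match <- unique(res$tx[is.na(res$genome_coord)])

# Rows that describe an actual targetable coordinate
hits <- res[!is.na(res$genome_coord), ]

print(no_match)
print(hits)
```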
A data.frame with the following columns where each row represents a single targetable coordinate (codons with multiple targetable coordinates will have a separate row for each):
COLUMN-NAME | DATA-TYPE | DESCRIPTION
tx | chr | Transcript symbol
gene | chr | Gene symbol
exon | int | Exon rank in gene
pep_length | int | Length of peptide
cds_length | int | Length of CDS DNA
chr | chr | Chromosome
strand | chr | Strand (+/-)
sg_strand | chr | Guide strand (+/-)
aa_target | chr | One letter code of targeted amino acid
codon | chr | Three letter codon
aa_coord | int | Coordinate of targeted amino acid (start == 1)
cds_coord | int | Coordinate of targeted base in CDS (start == 1)
genome_coord | int | Coordinate of targeted base in genome
NMD_pred | lgl | Is nonsense-mediated decay predicted (i.e. target base is >56 bases upstream of last exon junction)
rel_position | dbl | Relative position in CDS of the targeted base
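Since aa_coord and cds_coord are both 1-based, one can be derived from the other with integer arithmetic. The sketch below shows that relationship on made-up coordinates. The rel_position line is an assumption (targeted base divided by CDS length), since the exact definition used by the package is not stated here, and codon_pos is an illustrative name rather than an output column.

```r
# Made-up 1-based CDS coordinates within a 300-base CDS
cds_coord  <- c(1L, 3L, 4L, 298L)
cds_length <- 300L

# Amino acid index of each base: bases 1-3 map to 1, bases 4-6 to 2, ...
aa_coord <- (cds_coord - 1L) %/% 3L + 1L

# Position of each base within its codon (1, 2 or 3)
codon_pos <- (cds_coord - 1L) %% 3L + 1L

# Assumption: relative position taken as a fraction of CDS length
rel_position <- cds_coord / cds_length

print(data.frame(cds_coord, aa_coord, codon_pos, rel_position))
```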