add_peptide: Annotate splice junctions with resulting CDS and peptide...

View source: R/add_peptide.R

add_peptideR Documentation

Annotate splice junctions with resulting CDS and peptide sequence

Description

Annotate splice junctions with resulting CDS and peptide sequence

Usage

add_peptide(df, cds, flanking_size = 14, bsg = NULL, keep_ranges = FALSE)

Arguments

df

A data.frame with splice junctions in rows and at least the columns:

  • junc_id junction id consisting of genomic coordinates

  • tx_id the ID of the affected transcript/CDS (see add_tx)

cds

as a named GRangesList of coding sequences (CDS) ranges

flanking_size

the number of wild-type amino acids flanking the junction or novel sequence caused by the splice junction to the left and to the right. Frame shift junctions are flanked by wild-type amino acids to the left and are annotated until the first stop codon.

bsg

BSgenome object such as BSgenome.Hsapiens.UCSC.hg19

keep_ranges

Should GRanges of CDS and modified CDS be kept? If TRUE, the list columns cds_lst and cds_mod_lst are added to the output.

Value

A data.frame as the input with the additional column(s):

  • cds_mod_id an identifier made from tx_id and junc_id

  • junc_pos_cds the junction position in the modified CDS sequence

  • frame_shift Indicator whether junction leads to frame shift.

  • is_first_reading_frame Indicator whether modified CDS sequence is translated into protein sequence using the 1st reading frame (i.e. reading is starting from the first nucleotide) for in-frame peptides.

  • normalized_cds_junc_pos The normalized position of the junction in the modified CDS sequence to the left junction side.

  • protein the full protein sequence of the translated modified CDS.

  • normalized_protein_junc_pos The normalized position of the junction in the protein sequence to the left junction side.

  • peptide_context_junc_pos The junction position relative to the peptide_context sequence

  • junc_in_orf Indicator whether the junction is located in an open reading frame.

  • peptide_context_seq_raw the peptide sequence around the junction including stop codons.

  • peptide_context the peptide sequence around the junction truncated after stop codons.

  • truncated_cds Indicator whether the mutated gene product is a truncated from of the WT gene product. If TRUE, peptide_context = NA.

  • cds_description Descriptor of of the mutated gene product. Can be one of c("mutated cds", "truncated cds", "no mutated gene product", "no wt cds", "not in ORF")

If the keep_ranges is TRUE, the following additional columns are added to the output data.frame:

  • cds_lst a list of GRanges with the original CDS as provided in tx_id column and cds object..

  • cds_mod_lst a list of GRanges with the modified CDS ranges.

Examples

requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)
bsg <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19

add_peptide(toy_junc_df, toy_cds, bsg = bsg, flanking_size = 13)


TRON-Bioinformatics/splice2neo documentation built on March 25, 2024, 2:27 a.m.