add_context_seq: Annotate splice junctions with resulting transcript sequence

View source: R/add_context_seq.R

add_context_seqR Documentation

Annotate splice junctions with resulting transcript sequence

Description

Annotate splice junctions with resulting transcript sequence

Usage

add_context_seq(df, transcripts, size = 400, bsg = NULL, keep_ranges = FALSE)

Arguments

df

A data.frame with splice junctions in rows and at least the columns:

  • junc_id junction id consisting of genomic coordinates

  • tx_id the ID of the affected transcript (see add_tx)

transcripts

as a named GRangesList of transcripts

size

the size of the output sequence around the junction position (might be shorter if transcripts is shorter)

bsg

BSgenome object such as BSgenome.Hsapiens.UCSC.hg19

keep_ranges

Should GRanges of transcripts and modified transcript be kept? If TRUE, the list columns tx_lst and tx_mod_lst are added to the output.

Value

A data.frame with the same rows as the input df but with the following additional column(s):

  • tx_mod_id an identifier made from tx_id and junc_id

  • junc_pos_tx the junction position in the modified transcript sequence

  • cts_seq the context sequence

  • cts_junc_pos the junction position in the context sequence

  • cts_size the size of the context sequence

  • cts_id a unique id for the context sequence as hash value using the XXH128 hash algorithm

If the keep_ranges is TRUE, the following additional columns are added to the output data.frame:

  • tx_lst a list of GRanges with the original transcript as provided in tx_id column and transcripts object..

  • tx_mod_lst a list of GRanges with the modified transcript (see modify_tx)

Examples


requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)
bsg <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19

add_context_seq(toy_junc_df, toy_transcripts, size = 20, bsg = bsg)


TRON-Bioinformatics/splice2neo documentation built on Nov. 9, 2024, 5:28 p.m.