split_on_utr_and_add_feature: Split GRanges object based on cds
In chimeraviz: Visualization tools for gene fusions

Description Usage Arguments Value Examples

This function will look for ranges (exons) in the GRanges object that has the coding DNA sequence starting or stopping within it. If found, these exons are split, and each exon in the GRanges object will be tagged as either "protein_coding", "5utr", or "3utr". The returned GRanges object will have feature values set in mcols(gr)$feature reflecting this.

1	split_on_utr_and_add_feature(gr)

`gr`	The GRanges object we want to split and tag with feature info.

An updated GRanges object with feature values set.

# Load fusion data and choose a fusion object:
defuseData <- system.file(
  "extdata",
  "defuse_833ke_results.filtered.tsv",
  package="chimeraviz")
fusions <- import_defuse(defuseData, "hg19", 1)
fusion <- get_fusion_by_id(fusions, 5267)
# Create edb object
edbSqliteFile <- system.file(
  "extdata",
  "Homo_sapiens.GRCh37.74.sqlite",
  package="chimeraviz")
edb <- ensembldb::EnsDb(edbSqliteFile)
# Get all exons for all transcripts in the genes in the fusion transcript
allTranscripts <- ensembldb::exonsBy(
  edb,
  filter = list(
    AnnotationFilter::GeneIdFilter(
      c(
        partner_gene_ensembl_id(upstream_partner_gene(fusion)),
        partner_gene_ensembl_id(downstream_partner_gene(fusion))))),
  columns = c(
    "gene_id",
    "gene_name",
    "tx_id",
    "tx_cds_seq_start",
    "tx_cds_seq_end",
    "exon_id"))
# Extract one of the GRanges objects
gr <- allTranscripts[[1]]
# Check how many ranges there are here
length(gr)
# Should be 9 ranges
# Split the ranges containing the cds start/stop positions and add feature
# values:
gr <- split_on_utr_and_add_feature(gr)
# Check the length again
length(gr)
# Should be 11 now, as the range containing the cds_strat position and the
# range containing the cds_stop position has been split into separate ranges