shorten_gaps: Improve transcript structure visualization by shortening gaps
In dzhang32/ggtranscript: Visualizing Transcript Structure and Annotation using 'ggplot2'

Description

For a given set of exons and introns, shorten_gaps() reduces the width of gaps (regions that do not overlap any exons) to a user-specified target_gap_width. This is useful when visualizing transcripts with long introns, as it lets you home in on the regions of interest (i.e. exons) and compare transcript structures more easily.

Usage

shorten_gaps(exons, introns, group_var = NULL, target_gap_width = 100L)

Arguments

`exons`	`data.frame()` containing exons, which can originate from multiple transcripts differentiated by `group_var`.
`introns`	`data.frame()` containing the intron co-ordinates corresponding to the input `exons`. This can be created by applying `to_intron()` to the `exons`. If introns originate from multiple transcripts, they must be differentiated using `group_var`. If you do not use `to_intron()`, make sure intron starts and ends are defined exactly as the adjacent exon boundaries (rather than exon end + 1 and exon start - 1).
`group_var`	`character()` if the input data originates from more than one transcript, `group_var` must specify the column that differentiates transcripts (e.g. "transcript_id").
`target_gap_width`	`integer()` the width in base pairs to shorten the gaps to.
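
The boundary convention described for `introns` can be illustrated with a minimal base R sketch. The toy exons and the `tx1` label below are hypothetical, and the sketch only demonstrates the co-ordinate convention: any further columns that shorten_gaps() may expect in real input are omitted.

```r
# Hypothetical exons from a single transcript, sorted by start
exons <- data.frame(
    start = c(100L, 500L, 900L),
    end = c(200L, 600L, 1000L),
    transcript_name = "tx1"
)

# Each intron runs from one exon's end to the next exon's start,
# so it shares its boundaries with the flanking exons
# (no + 1 / - 1 adjustment is applied)
n <- nrow(exons)
introns <- data.frame(
    start = exons$end[-n],
    end = exons$start[-1],
    transcript_name = exons$transcript_name[-n]
)
introns
```

When working with real annotation, applying `to_intron()` to the exons, as recommended above, avoids having to build this by hand.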

Details

After shortening the gaps, shorten_gaps() re-scales exons and introns to preserve exon alignment. Only the widths of input introns are reduced, never those of exons. Importantly, the re-scaled co-ordinates should only be used for visualization, as they no longer match the original genomic co-ordinates.
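
The general idea of shortening gaps while leaving exon widths untouched can be sketched in base R. This is a simplified illustration for a single hypothetical transcript with sorted, non-overlapping exons, and is not the package's implementation, which also has to handle multiple transcripts and preserve their alignment.

```r
# Toy exons from one hypothetical transcript, sorted and non-overlapping
exons <- data.frame(
    start = c(100L, 5000L, 20000L),
    end = c(200L, 5100L, 20100L)
)
target_gap_width <- 100L

# Width of the gap preceding each exon (0 for the first exon)
gap_before <- c(0L, exons$start[-1] - exons$end[-nrow(exons)])

# Amount removed from each gap: only gaps wider than the target shrink
removed <- pmax(gap_before - target_gap_width, 0L)

# Shift every exon left by the total width removed upstream of it
shift <- cumsum(removed)
rescaled <- data.frame(
    start = exons$start - shift,
    end = exons$end - shift
)
rescaled
```

In this sketch every exon keeps its original width and only the gaps shrink, which is also why the resulting co-ordinates are suitable for plotting but not for any analysis that depends on genomic position.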

Value

data.frame() containing the re-scaled co-ordinates of the introns and exons of each input transcript, with gaps shortened.

Examples


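library(magrittr)
library(ggplot2)
# ggtranscript provides shorten_gaps(), to_intron(), to_diff(),
# geom_range(), geom_intron() and the pknox1_annotation example data
library(ggtranscript)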

# to illustrate the package's functionality
# ggtranscript includes example transcript annotation
pknox1_annotation %>% head()

# extract exons
pknox1_exons <- pknox1_annotation %>% dplyr::filter(type == "exon")
pknox1_exons %>% head()

# to_intron() is a helper function included in ggtranscript
# which is useful for converting exon co-ordinates to introns
pknox1_introns <- pknox1_exons %>% to_intron(group_var = "transcript_name")
pknox1_introns %>% head()

# for transcripts with long introns, the exons of interest
# can be difficult to visualize clearly when using the default scale
pknox1_exons %>%
    ggplot(aes(
        xstart = start,
        xend = end,
        y = transcript_name
    )) +
    geom_range() +
    geom_intron(
        data = pknox1_introns,
        arrow.min.intron.length = 3500
    )

# in such cases it can be useful to rescale the exons and introns
# using shorten_gaps() which shortens regions that do not overlap an exon
pknox1_rescaled <-
    shorten_gaps(pknox1_exons, pknox1_introns, group_var = "transcript_name")

pknox1_rescaled %>% head()

# this allows us to visualize differences in exonic structure more clearly
pknox1_rescaled %>%
    dplyr::filter(type == "exon") %>%
    ggplot(aes(
        xstart = start,
        xend = end,
        y = transcript_name
    )) +
    geom_range() +
    geom_intron(
        data = pknox1_rescaled %>% dplyr::filter(type == "intron"),
        arrow.min.intron.length = 300
    )

# shorten_gaps() can be used in combination with to_diff()
# to further highlight differences in exon structure
# here, all other transcripts are compared to the MANE-select transcript
pknox1_rescaled_diffs <- to_diff(
    exons = pknox1_rescaled %>%
        dplyr::filter(type == "exon", transcript_name != "PKNOX1-201"),
    ref_exons = pknox1_rescaled %>%
        dplyr::filter(type == "exon", transcript_name == "PKNOX1-201"),
    group_var = "transcript_name"
)

pknox1_rescaled %>%
    dplyr::filter(type == "exon") %>%
    ggplot(aes(
        xstart = start,
        xend = end,
        y = transcript_name
    )) +
    geom_range() +
    geom_intron(
        data = pknox1_rescaled %>% dplyr::filter(type == "intron"),
        arrow.min.intron.length = 300
    ) +
    geom_range(
        data = pknox1_rescaled_diffs,
        aes(fill = diff_type),
        alpha = 0.2
    )

dzhang32/ggtranscript documentation built on Aug. 29, 2024, 2:43 a.m.
