plotCoverage: RNA-seq Coverage Plots and Genome Tracks

View source: R/plotCoverage.R

plotCoverageR Documentation

RNA-seq Coverage Plots and Genome Tracks

Description

Generate plotly / ggplot RNA-seq genome and coverage plots from command line. Note that these are legacy functions. More expansive functionality is available using getCoverageData / getPlotObject / plotView functions.

Usage

plotCoverage(
  se,
  Event,
  Gene,
  seqname,
  start,
  end,
  coordinates,
  strand = c("*", "+", "-"),
  zoom_factor = 0.2,
  bases_flanking = 100,
  tracks,
  track_names = tracks,
  condition,
  ribbon_mode = c("sd", "ci", "sem", "none"),
  selected_transcripts = "",
  reverseGenomeCoords = FALSE,
  plotJunctions = FALSE,
  junctionThreshold = 0.01,
  plot_key_isoforms = FALSE,
  condense_tracks = FALSE,
  stack_tracks = FALSE,
  t_test = FALSE,
  norm_event,
  usePlotly = FALSE
)

plotGenome(
  se,
  reference_path,
  Event,
  Gene,
  seqname,
  start,
  end,
  coordinates,
  zoom_factor = 0.2,
  bases_flanking = 100,
  reverseGenomeCoords = FALSE,
  condense_tracks = FALSE,
  selected_transcripts = "",
  plot_key_isoforms = FALSE,
  usePlotly = FALSE
)

Arguments

se

A NxtSE object, created by makeSE. COV files must be linked to the NxtSE object. To do this, see the example in makeSE. Required by plotCoverage. Not required by plotGenome if reference_path is supplied.

Event

The EventName of the IR / alternative splicing event to be displayed. Use rownames(se) to display a list of valid events.

Gene

Whether to use the range for the given Gene. If given, overrides Event (but Event or norm_event will be used to normalise by condition). Valid Gene entries include gene_id (Ensembl ID) or gene_name (Gene Symbol).

seqname, start, end

The chromosome (string) and genomic start/end coordinates (numeric) of the region to display. If present, overrides both Event and Gene. E.g. for a given region of chr1:10000-11000, use the parameters: ⁠seqname = "chr1", start = 10000, end = 11000⁠

coordinates

A string specifying genomic coordinates can be given instead of ⁠seqname,start,end⁠. Must be of the format "chr:start-end", e.g. "chr1:10000-11000"

strand

Whether to show coverage of both strands "*" (default), or from the "+" or "-" strand only.

zoom_factor

Zoom out from event. Each level of zoom zooms out by a factor of 3. E.g. for a query region of chr1:10000-11000, if a zoom_factor of 1.0 is given, chr1:99000-12000 will be displayed.

bases_flanking

(Default = 100) How many bases flanking the zoomed window. Useful when used in conjunction with zoom_factor == 0. E.g. for a given region of chr1:10000-11000, if zoom_factor = 0 and bases_flanking = 100, the region chr1:9900-11100 will be displayed.

tracks

The names of individual samples, or the names of the different conditions to be plotted. For the latter, set condition to the specified condition category.

track_names

The names of the tracks to be displayed. If omitted, the track_names will default to the input in tracks

condition

To display normalised coverage per condition, set this to the condition category. If omitted, tracks are assumed to refer to the names of individual samples.

ribbon_mode

(default "sd") Whether coverage ribbons signify standard deviation "sd", 95% confidence interval "ci", standard error of the mean "sem", or none "none". Only applicable when condition is set.

selected_transcripts

(Optional) A vector containing transcript ID or transcript names of transcripts to be displayed on the gene annotation track. Useful to remove minor isoforms that are not relevant to the samples being displayed.

reverseGenomeCoords

(default FALSE) Whether to reverse the genomic coordinates - helpful for intuitive plotting of negative-strand genes

plotJunctions

(default FALSE) If TRUE, sashimi plot junction arcs are plotted. Currently only implemented for plots of individual samples.

junctionThreshold

(default 0.01) The threshold expression of junction reads below which junction arcs will be omitted. This removes cluttering of junction arcs from lowly-expressed (rare) junctions. For individual tracks, this is the fraction of coverage height. For by-condition tracks, this is a PSI threshold.

plot_key_isoforms

(default FALSE) If TRUE, only transcripts involved in the selected Event or pair of Events will be displayed.

condense_tracks

(default FALSE) Whether to collapse the transcript track annotations by gene.

stack_tracks

(default FALSE) Whether to graph all the conditions on a single coverage track. If set to TRUE, each condition will be displayed in a different colour on the same track. Ignored if condition is not set.

t_test

(default FALSE) Whether to perform a pair-wise T-test. Only used if there are TWO condition tracks.

norm_event

Whether to normalise by an event different to that given in "Event". The difference between this and Event is that the genomic coordinates can be centered around a different Event, Gene or region as given in seqname/start/end. If norm_event is different to Event, norm_event will be used for normalisation and Event will be used to define the genomic coordinates of the viewing window. norm_event is required if Event is not set and condition is set.

usePlotly

If TRUE, returns a covPlotly object containing the plotly-based interactive plot. If FALSE, returns a ggplot object.

reference_path

The path of the reference generated by Build-Reference-methods. Required by plotGenome if a NxtSE object is not specified.

Details

In RNA sequencing, alignments to spliced transcripts will "skip" over genomic regions of introns. This can be illustrated in a plot using a horizontal genomic axis, with the vertical axis representing the number of alignments covering each nucleotide. As a result, the coverage "hills" represent the expression of exons, and "valleys" to introns.

Different alternatively-spliced isoforms thus produce different coverage patterns. The change in the coverage across an alternate exon relative to its constitutively-included flanking exons, for example, represents its alternative inclusion or skipping. Similarly, elevation of intron valleys represent increased intron retention.

With multiple replicates per sample, coverage is dependent on library size and gene expression. To compare alternative splicing ratios, normalisation of the coverage of the alternate exon (or alternatively retained intron) relative to their constitutive flanking exons, is required. There is no established method for this normalisation, and can be confounded in situations where flanking elements are themselves alternatively spliced.

SpliceWiz performs this coverage normalisation using the same method as its estimate of spliced / intronic transcript abundance using the SpliceOver method (see details section in collateData). This normalisation can be applied to correct for library size and gene expression differences between samples of the same experimental condition. After normalisation, mean and variance of coverage can be computed as ratios relative to total transcript abundance. This method can visualise alternatively included genomic regions including casette exons, alternate splice site usage, and intron retention.

plotCoverage generates plots showing depth of alignments to the genomic axis. Plots can be generated for individual samples or samples grouped by experimental conditions. In the latter, mean and 95% confidence intervals are shown.

plotGenome generates genome transcript tracks only. Protein-coding regions are denoted by thick rectangles, whereas non-protein coding transcripts or untranslated regions are denoted with thin rectangles. Introns are denoted as lines.

Value

For plotCoverage and plotGenome:

  • If usePlotly = FALSE returns a patchwork-assembled static plot

  • If usePlotly = TRUE returns a covPlotly object, which generates a plotly interactive plot when shown using show()

Functions

  • plotCoverage(): Legacy function - works by internally calling getCoverageData(), getPlotObject(), then plotView()

  • plotGenome(): Legacy function - works by internally calling getGenomeData(), followed by plotAnnoTrack()

Examples

se <- SpliceWiz_example_NxtSE(novelSplicing = TRUE)

# Assign annotation of the experimental conditions
colData(se)$treatment <- rep(c("A", "B"), each = 3)

# Verify that the COV files are linked to the NxtSE object:
covfile(se)

# Plot the genome track only, with specified gene:
plotGenome(se, Gene = "SRSF3")

# View the genome track, specifying a genomic region via coordinates:
plotGenome(se, coordinates = "chrZ:10000-20000")

# Return a list of ggplot and plotly objects, also plotting junction counts
plotCoverage(
    se = se,
    Event = "SE:SRSF3-203-exon4;SRSF3-202-int3",
    tracks = colnames(se)[1:4], plotJunctions = TRUE
)

# Plot the same, but as a plotly interactive plot

if(interactive()) {
    p <- plotCoverage(
        se = se,
        Event = "SE:SRSF3-203-exon4;SRSF3-202-int3",
        tracks = colnames(se)[1:4], plotJunctions = TRUE,
        usePlotly = TRUE
    )
    show(p)
}

# Plot by condition "treatment", including provisional PSIs
plotCoverage(
    se = se,
    Event = "SE:SRSF3-203-exon4;SRSF3-202-int3",
    tracks = c("A", "B"), condition = "treatment", plotJunctions = TRUE
)

# As above, but stack all traces into the same track
# - NB: plotJunctions is disabled when `stack_tracks = TRUE`
plotCoverage(
    se = se,
    Event = "SE:SRSF3-203-exon4;SRSF3-202-int3",
    tracks = c("A", "B"), condition = "treatment", stack_tracks = TRUE
)

# Plot the above, but unstancked, and with t-test track
# - NB: plotJunctions is disabled when `stack_tracks = TRUE`
plotCoverage(
    se = se,
    Event = "SE:SRSF3-203-exon4;SRSF3-202-int3",
    tracks = c("A", "B"), condition = "treatment", t_test = TRUE
)

# Select only transcripts involved in the selected alternative splicing event
plotCoverage(
    se = se,
    Event = "SE:SRSF3-203-exon4;SRSF3-202-int3",
    tracks = colnames(se)[1:4],
    plot_key_isoforms = TRUE
)

alexchwong/SpliceWiz documentation built on Oct. 15, 2024, 10:12 a.m.