plotFunctions: Visualization functions for FRASER

Description

The FRASER package provides multiple functions to visualize the data and the results of a full data set analysis.

Plots the p-values over the delta psi values, known as a volcano plot. Visualizes the outliers per sample and psi type, aggregated by gene if requested.

Plots the number of aberrant events per sample.

Plots the observed split reads of the junction of interest over all reads coming from the given donor/acceptor.

Plots the expected psi value over the observed psi value of the given junction.

Plots the quantile-quantile plot.

Histogram of the geometric mean per junction based on the filter status.

Histogram of minimal delta psi per junction.

Count correlation heatmap function.

Usage

## S4 method for signature 'FraserDataSet'
plotVolcano(
  object,
  sampleID,
  type = c("psi3", "psi5", "theta"),
  basePlot = TRUE,
  aggregate = FALSE,
  main = NULL,
  label = NULL,
  deltaPsiCutoff = 0.3,
  padjCutoff = 0.1,
  ...
)

## S4 method for signature 'FraserDataSet'
plotAberrantPerSample(
  object,
  main,
  type = c("psi3", "psi5", "theta"),
  padjCutoff = 0.1,
  zScoreCutoff = NA,
  deltaPsiCutoff = 0.3,
  aggregate = TRUE,
  BPPARAM = bpparam(),
  ...
)

plotExpression(
  fds,
  type = c("psi5", "psi3", "theta"),
  site = NULL,
  result = NULL,
  colGroup = NULL,
  basePlot = TRUE,
  main = NULL,
  label = "aberrant",
  ...
)

plotExpectedVsObservedPsi(
  fds,
  type = c("psi5", "psi3", "theta"),
  idx = NULL,
  result = NULL,
  colGroup = NULL,
  main = NULL,
  basePlot = TRUE,
  label = "aberrant",
  ...
)

## S4 method for signature 'FraserDataSet'
plotQQ(
  object,
  type = NULL,
  idx = NULL,
  result = NULL,
  aggregate = FALSE,
  global = FALSE,
  main = NULL,
  conf.alpha = 0.05,
  samplingPrecision = 3,
  basePlot = TRUE,
  label = "aberrant",
  Ncpus = min(3, getDTthreads()),
  ...
)

## S4 method for signature 'FraserDataSet'
plotEncDimSearch(
  object,
  type = c("psi3", "psi5", "theta"),
  plotType = c("auc", "loss")
)

plotFilterExpression(
  fds,
  bins = 200,
  legend.position = c(0.8, 0.8),
  onlyVariableIntrons = FALSE
)

plotFilterVariability(
  fds,
  bins = 200,
  legend.position = c(0.8, 0.8),
  onlyExpressedIntrons = FALSE
)

## S4 method for signature 'FraserDataSet'
plotCountCorHeatmap(
  object,
  type = c("psi5", "psi3", "theta"),
  logit = FALSE,
  topN = 50000,
  topJ = 5000,
  minMedian = 1,
  minCount = 10,
  main = NULL,
  normalized = FALSE,
  show_rownames = FALSE,
  show_colnames = FALSE,
  minDeltaPsi = 0.1,
  annotation_col = NA,
  annotation_row = NA,
  border_color = NA,
  nClust = 5,
  plotType = c("sampleCorrelation", "junctionSample"),
  sampleClustering = NULL,
  plotMeanPsi = TRUE,
  plotCov = TRUE,
  ...
)

Arguments

object, fds

A FraserDataSet object.

sampleID

A sample ID which should be plotted. Can also be a vector. Integers are treated as indices.

type

The psi type: either psi5, psi3 or theta (for SE).

basePlot

If TRUE (default), use the R base plot version; otherwise use the plotly framework.

aggregate

If TRUE, the p-values are aggregated by gene; otherwise junction-level p-values are used. As shown in Usage, the default is TRUE for plotAberrantPerSample and FALSE for plotVolcano and plotQQ.

main

Title for the plot, if missing a default title will be used.

label

Indicates the genes or samples that will be labelled in the plot (only for basePlot=TRUE). Setting label="aberrant" will label all aberrant genes or samples. Labelling can be turned off by setting label=NULL. The user can also provide a custom list of gene symbols or sampleIDs.

padjCutoff, zScoreCutoff, deltaPsiCutoff

Significance (adjusted p-value), Z-score, or delta psi cutoff used to mark outliers.

...

Additional parameters passed to plot() or plot_ly() if not stated otherwise in the details for each plot function

BPPARAM

BiocParallel parameter to use.

result

The result table to be used by the method.

colGroup

Group of samples that should be colored.

idx, site

A junction site ID or a gene ID (either may be given) that should be plotted. Can also be a vector. Integers are treated as indices.

global

Flag to plot a global Q-Q plot, default FALSE

conf.alpha

If set, a confidence interval is plotted, defaults to 0.05

samplingPrecision

Plot only non overlapping points in Q-Q plot to reduce number of points to plot. Defines the digits to round to.

Ncpus

Number of cores to use.

plotType

The type of plot that should be shown as character string. For plotEncDimSearch, it has to be either "auc" for a plot of the area under the curve (AUC) or "loss" for the model loss. For the correlation heatmap, it can be either "sampleCorrelation" for a sample-sample correlation heatmap or "junctionSample" for a junction-sample correlation heatmap.

bins

Set the number of bins to be used in the histogram.

legend.position

Set legend position (x and y coordinate), defaults to the top right corner.

onlyVariableIntrons

Logical value indicating whether to show only introns that also pass the variability filter. Defaults to FALSE.

onlyExpressedIntrons

Logical value indicating whether to show only introns that also pass the expression filter. Defaults to FALSE.

logit

If TRUE, psi values are plotted in logit space. Defaults to FALSE.

topN, topJ

Number of most variable junctions that should be used in the heatmap. topN is used for sample-sample correlation heatmaps and topJ for junction-sample correlation heatmaps.

minMedian, minCount, minDeltaPsi

Minimal median (m ≥ 1), read count (n ≥ 10), and delta psi (|Δψ| > 0.1) value of a junction to be considered for the correlation heatmap.

normalized

If TRUE, the normalized psi values are used; otherwise the raw psi values are used. Defaults to FALSE.

show_rownames, show_colnames

Logical value indicating whether to show row or column names on the heatmap axes.

annotation_col, annotation_row

Column or row annotations that should be plotted on the heatmap.

border_color

Sets the border color of the heatmap

nClust

Number of clusters to show in the row and column dendrograms.

sampleClustering

A clustering of the samples that should be used as an annotation of the heatmap.

plotMeanPsi, plotCov

If TRUE, then the heatmap is annotated with the mean psi values or the junction coverage.

Details

The plotting functions provided by FRASER are described individually below. Most of them share the same parameters.

plotAberrantPerSample: The number of aberrant events per sample is plotted, sorted by rank. The ... parameters are passed on to the aberrant function.

plotVolcano: The volcano plot is sample-centric. For a given sample and psi type, it plots the negative log10 nominal p-values against the delta psi values for all splice sites, or aggregated by gene if requested.
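To make the two axes concrete, the following base R sketch draws a volcano-style plot from simulated values. It is only an illustration and does not call FRASER: the vectors deltaPsi and pval are made-up data, and the cutoffs 0.3 and 0.1 merely mirror the defaults of deltaPsiCutoff and padjCutoff shown in Usage. Note that padjCutoff refers to adjusted p-values, whereas this sketch uses a single simulated p-value vector for simplicity.

```r
# Illustration only: simulated values, not FRASER output.
set.seed(42)
n <- 500
deltaPsi <- runif(n, -0.6, 0.6)   # hypothetical delta psi values
pval <- runif(n)^3                # hypothetical p-values, skewed towards 0
negLog10P <- -log10(pval)         # y axis of a volcano plot

# Mark points that exceed both illustrative cutoffs.
isOutlier <- abs(deltaPsi) >= 0.3 & pval <= 0.1

plot(deltaPsi, negLog10P, pch = 16,
     col = ifelse(isOutlier, "firebrick", "grey60"),
     xlab = "delta psi", ylab = "-log10(p-value)",
     main = "Volcano-style plot (simulated data)")
abline(v = c(-0.3, 0.3), lty = 2)   # delta psi cutoff
abline(h = -log10(0.1), lty = 2)    # p-value cutoff
```

Points in the upper left and upper right corners are the ones that would be flagged under these illustrative cutoffs.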

plotExpression: This function plots for a given site the read count at this site (i.e. K) against the total coverage (i.e. N) for the given psi type (ψ5, ψ3, or θ (SE)) for all samples.
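As a rough illustration of the quantities involved, the sketch below simulates K and N for one site across samples and plots them against each other in base R. The data are made up, the sample count and probabilities are arbitrary, and the ratio K / N is used here as the simplest form of the observed psi value; this is not FRASER's own plotting code.

```r
# Illustration only: simulated counts, not FRASER output.
set.seed(7)
nSamples <- 40
N <- rpois(nSamples, lambda = 200) + 1        # total coverage at the site
K <- rbinom(nSamples, size = N, prob = 0.8)   # reads supporting the junction
K[1] <- rbinom(1, size = N[1], prob = 0.2)    # one hypothetical aberrant sample

psi <- K / N                                  # observed ratio per sample

plot(N, K, pch = 16,
     col = ifelse(seq_len(nSamples) == 1, "firebrick", "grey40"),
     xlab = "Total coverage (N)", ylab = "Junction count (K)",
     main = "K against N (simulated data)")
```

Samples that follow the common splicing ratio fall along a line through the origin, while the simulated aberrant sample lies clearly below it.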

plotQQ: The quantile-quantile plot for a given gene or, if global is set to TRUE, over the full data set. The observed p-values are plotted against the expected ones in negative log10 space.
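The construction of such a Q-Q plot can be sketched in base R as follows. The p-values are simulated under the null, the expected quantiles come from stats::ppoints, and the thinning step only approximates the idea behind samplingPrecision (rounding to a number of digits and dropping overlapping points); FRASER's actual implementation may differ.

```r
# Illustration only: simulated p-values, not FRASER output.
set.seed(1)
pval <- runif(1000)                          # hypothetical null p-values

observed <- -log10(sort(pval))               # smallest p-value first
expected <- -log10(ppoints(length(pval)))    # expected uniform quantiles

# Thin the points: keep one point per rounded (expected, observed) pair.
digits <- 3                                  # plays the role of samplingPrecision
keep <- !duplicated(cbind(round(expected, digits), round(observed, digits)))

plot(expected[keep], observed[keep], pch = 16, cex = 0.6,
     xlab = "Expected -log10(p-value)",
     ylab = "Observed -log10(p-value)",
     main = "Q-Q plot (simulated data)")
abline(0, 1, col = "firebrick")              # line of perfect calibration
```

Well-calibrated p-values lie along the diagonal; systematic deviation above it indicates more small p-values than expected.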

plotExpectedVsObservedPsi: A scatter plot of the observed psi against the predicted psi for a given site.

plotCountCorHeatmap: The correlation heatmap of the count data either of the full data set (i.e. sample-sample correlations) or of the top x most variable junctions (i.e. junction-sample correlations). By default the values are log transformed and row centered. The ... arguments are passed to the pheatmap function.
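The transformation described above can be sketched with base R alone. In this illustration the count matrix is simulated, the log2 transform with a pseudocount of 1 is an assumption (the text only states that values are log transformed), and stats::heatmap stands in for pheatmap.

```r
# Illustration only: simulated junction-by-sample counts, not FRASER output.
set.seed(3)
counts <- matrix(rpois(200 * 10, lambda = 50), nrow = 200,
                 dimnames = list(paste0("junction", 1:200),
                                 paste0("sample", 1:10)))

logCounts <- log2(counts + 1)                 # assumed log transform
centered <- logCounts - rowMeans(logCounts)   # row centering per junction

sampleCor <- cor(centered)                    # sample-sample correlation matrix

heatmap(sampleCor, symm = TRUE, scale = "none",
        main = "Sample correlation (simulated data)")
```

Row centering removes junction-specific expression levels so that the correlation reflects sample-to-sample variation rather than overall coverage.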

plotFilterExpression: The distribution of FPKM values. If the FraserDataSet object contains the passedFilter column, it will plot both FPKM distributions for the expressed introns and for the filtered introns.

plotFilterVariability: The distribution of maximal delta Psi values. If the FraserDataSet object contains the passedFilter column, it will plot both maximal delta Psi distributions for the variable introns and for the filtered (i.e. non-variable) introns.

plotEncDimSearch: Visualization of the hyperparameter optimization. It plots the encoding dimension against the achieved loss (area under the precision-recall curve). From this plot, the optimum should be chosen for q in the fitting process.
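Choosing the optimum from such a curve can be illustrated with a small base R sketch. The table below is entirely made up (the q values echo the q_param vector used in the Examples, the scores are arbitrary), and the column names are hypothetical rather than those stored in a FraserDataSet.

```r
# Illustration only: a made-up hyperparameter search table.
search <- data.frame(
  q     = c(2, 5, 10, 25),            # candidate encoding dimensions
  aucPR = c(0.61, 0.78, 0.74, 0.66)   # hypothetical precision-recall AUC
)

# Pick the encoding dimension with the highest score.
bestQ <- search$q[which.max(search$aucPR)]

plot(search$q, search$aucPR, type = "b", log = "x",
     xlab = "Encoding dimension q", ylab = "AUC (precision-recall)",
     main = "Encoding dimension search (made-up values)")
abline(v = bestQ, lty = 2)            # mark the chosen q
```

With real data, a flat region around the maximum suggests that nearby values of q would perform similarly.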

Value

If base R graphics are used, nothing is returned; otherwise the plotly or ggplot object is returned.

Examples

# create full FRASER object 
fds <- makeSimulatedFraserDataSet(m=40, j=200)
fds <- calculatePSIValues(fds)
fds <- filterExpressionAndVariability(fds, filter=FALSE)
# this step should be done for all splicing metrics and more dimensions
fds <- optimHyperParams(fds, "psi5", q_param=c(2,5,10,25))
fds <- FRASER(fds)

# QC plotting
plotFilterExpression(fds)
plotFilterVariability(fds)
plotCountCorHeatmap(fds, "theta")
plotCountCorHeatmap(fds, "theta", normalized=TRUE)
plotEncDimSearch(fds, type="psi5")

# extract results 
plotAberrantPerSample(fds)
plotVolcano(fds, "sample1", "psi5")

# dive into gene/sample level results
res <- results(fds)
res
plotExpression(fds, result=res[1])
plotQQ(fds, result=res[1])
plotExpectedVsObservedPsi(fds, type="psi5", result=res[1])

FRASER documentation built on Feb. 3, 2021, 2:01 a.m.