plotCorrelation: Pairwise scatter plots and correlations of CAGE signal

plotCorrelation    R Documentation

Pairwise scatter plots and correlations of CAGE signal

Description

Calculates the pairwise correlation between samples and creates a plot matrix showing the correlation coefficients in the upper triangle, the sample names on the diagonal, and the scatter plots in the lower triangle.

Usage

plotCorrelation(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  plotSize = 800
)

## S4 method for signature 'CAGEr'
plotCorrelation(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  plotSize = 800
)

plotCorrelation2(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  digits = 3
)

## S4 method for signature 'CAGEexp'
plotCorrelation2(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  digits = 3
)

## S4 method for signature 'SummarizedExperiment'
plotCorrelation2(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  digits = 3
)

## S4 method for signature 'DataFrame'
plotCorrelation2(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  digits = 3
)

## S4 method for signature 'data.frame'
plotCorrelation2(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  digits = 3
)

## S4 method for signature 'matrix'
plotCorrelation2(
  object,
  what = c("CTSS", "consensusClusters"),
  values = c("raw", "normalized"),
  samples = "all",
  method = "pearson",
  tagCountThreshold = 1,
  applyThresholdBoth = FALSE,
  digits = 3
)

Arguments

object

A CAGEr object or (only for plotCorrelation2) a SummarizedExperiment or an expression table as a DataFrame, data.frame or matrix object.

what

The clustering level to be used for plotting and calculating correlations. Can be either "CTSS" to use individual TSSs or "consensusClusters" to use consensus clusters, i.e. entire promoters. Ignored for anything other than CAGEr objects.

values

Use either "raw" (default) or "normalized" CAGE signal. Ignored for plain expression tables.

samples

Character vector indicating which samples to use. Can be either "all" to select all samples in a CAGEr object, or a subset of valid sample labels as returned by the sampleLabels function.

method

A character string indicating which correlation coefficient should be computed. Passed to the cor function. Can be one of "pearson", "spearman", or "kendall".

tagCountThreshold

Only TSSs with tag count >= tagCountThreshold in either one (applyThresholdBoth = FALSE) or both samples (applyThresholdBoth = TRUE) are plotted and used to calculate correlation.

applyThresholdBoth

See tagCountThreshold above.

plotSize

Size of each individual comparison plot in pixels; the total size of the resulting PNG will be length(samples) * plotSize in both dimensions. Ignored in plotCorrelation2.

digits

The number of significant digits for the data to be kept in log scale. Ignored in plotCorrelation. In plotCorrelation2, the number of points plotted is considerably reduced by rounding the point coordinates to a small number of significant digits before removing duplicates. Choose a value that makes the plot visually indistinguishable from one made with non-deduplicated data, by testing on a subset of the data.

Details

In the scatter plots, a pseudocount equal to half the lowest score is added to the zero values so that they remain visible on the logarithmic scale.
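
The pseudocount idea can be illustrated with a minimal base R sketch. This is not the CAGEr implementation: add_pseudocount is a hypothetical helper, and it assumes that "lowest score" means the lowest non-zero value of the vector being plotted.

```r
# Hypothetical helper (not part of CAGEr): replace zeros with half of the
# smallest non-zero value so that they can be drawn on a log scale.
# Assumption: "lowest score" is taken as the lowest non-zero value of x.
add_pseudocount <- function(x) {
  pseudo <- min(x[x > 0]) / 2
  x[x == 0] <- pseudo
  x
}

scores <- c(0, 2, 8, 0, 32)          # made-up expression scores
log_ready <- add_pseudocount(scores)

# All values are now strictly positive, so log10() is finite everywhere.
log10(log_ready)
```

The zeros end up below every observed value on the log axis, which keeps them visually separate from the real signal.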

SummarizedExperiment objects are expected to contain raw tag counts in a “counts” assay and the normalized expression scores in a “normalized” assay.

Avoid using large matrix objects as they are coerced to DataFrame class without special care for efficiency.

plotCorrelation2 speeds up the plotting by a) deduplicating the data: no point is plotted twice at the same coordinates, b) rounding the data so that indistinguishable positions are plotted only once, c) using a black square glyph for the points, d) caching some calculations that are made repeatedly (to determine where to plot the correlation coefficients), and e) preventing coercion of DataFrames to data.frames.
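
To get a feel for how rounding and deduplication reduce the number of points, here is a rough base R sketch on simulated data. The function dedup_points is hypothetical, and the use of signif on log10-transformed values is an assumption based on the description of the digits argument, not a copy of the package code.

```r
# Simulated, strictly positive expression values for two samples.
set.seed(1)
x <- rlnorm(10000, meanlog = 2)
y <- x * rlnorm(10000, sdlog = 0.3)

# Hypothetical helper (not part of CAGEr): round the log-scale coordinates
# to `digits` significant digits, then keep each distinct position once.
dedup_points <- function(x, y, digits = 3) {
  pts <- data.frame(
    x = signif(log10(x), digits),
    y = signif(log10(y), digits)
  )
  unique(pts)
}

n3 <- nrow(dedup_points(x, y, digits = 3))
n1 <- nrow(dedup_points(x, y, digits = 1))

# Compare how many points remain to be drawn at each precision.
c(original = length(x), digits3 = n3, digits1 = n1)
```

Fewer digits means fewer points to draw but a coarser picture, which is why the digits argument is best tuned by eye on a subset of the data.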

Value

Displays the plot and returns a matrix of pairwise correlations between selected samples. The scatterplots of plotCorrelation are colored according to the density of points, and in plotCorrelation2 they are just black and white, which is much faster to plot. Note that while the scatterplots are on a logarithmic scale with pseudocount added to the zero values, the correlation coefficients are calculated on untransformed (but thresholded) data.
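
The relationship between thresholding and the correlation coefficient can be sketched in base R as follows. The function pair_cor and the tag counts are made up for illustration; the sketch only mirrors the documented behaviour of tagCountThreshold and applyThresholdBoth and is not the package's internal code.

```r
# Made-up tag counts for two samples at the same eight positions.
a <- c(0, 1, 5, 0, 12, 3, 0, 7)
b <- c(2, 0, 4, 0, 10, 0, 1, 9)

# Hypothetical helper (not part of CAGEr): keep positions that pass the
# threshold in either sample (default) or in both samples, then compute
# the correlation on the untransformed counts.
pair_cor <- function(a, b, tagCountThreshold = 1,
                     applyThresholdBoth = FALSE, method = "pearson") {
  keep <- if (applyThresholdBoth) {
    a >= tagCountThreshold & b >= tagCountThreshold
  } else {
    a >= tagCountThreshold | b >= tagCountThreshold
  }
  cor(a[keep], b[keep], method = method)
}

r_either <- pair_cor(a, b)                             # drops only the (0, 0) position
r_both   <- pair_cor(a, b, applyThresholdBoth = TRUE)  # keeps positions present in both
```

With applyThresholdBoth = TRUE, positions detected in only one sample are excluded, so the coefficient reflects agreement among shared positions only.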

Author(s)

Vanja Haberle

Charles Plessy

See Also

Other CAGEr plot functions: TSSlogo(), hanabiPlot(), plotAnnot(), plotExpressionProfiles(), plotInterquantileWidth(), plotReverseCumulatives()

Examples


plotCorrelation2(exampleCAGEexp, what = "consensusClusters", values = "normalized")


charles-plessy/CAGEr documentation built on Oct. 27, 2024, 10:11 p.m.