plotReverseCumulatives: Plot reverse cumulative number of CAGE tags per CTSS

plotReverseCumulatives    R Documentation

Plot reverse cumulative number of CAGE tags per CTSS

Description

Plots the reverse cumulative distribution of the CTSS expression values for all CAGE datasets present in the CAGEexp object. The horizontal axis represents an expression value, and the vertical axis represents the number of CTSS positions whose expression is greater than or equal to that value. The plot uses a log-log scale. Use these plots to help choose the range of values and the referent slope for power-law normalization (Balwierz et al., 2009).
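As an illustration of what a reverse cumulative distribution contains, the following minimal base R sketch computes one from a small vector of tag counts. The counts and the variable names (tag_counts, expr_values, n_ctss) are made up for this example; this is not CAGEr's internal implementation.

```r
# Hypothetical tag counts for seven CTSS positions (illustrative values only).
tag_counts <- c(1, 1, 2, 5, 5, 10, 100)

# Distinct expression values in increasing order (horizontal axis).
expr_values <- sort(unique(tag_counts))

# For each value, count the CTSS positions whose tag count is greater than
# or equal to it (vertical axis).
n_ctss <- vapply(expr_values, function(v) sum(tag_counts >= v), numeric(1))

reverse_cumulative <- data.frame(value = expr_values, n_ctss = n_ctss)
print(reverse_cumulative)
```

Plotting n_ctss against value with both axes on a logarithmic scale gives a curve of the kind this function draws for each sample.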

Usage

plotReverseCumulatives(
  object,
  values = c("raw", "normalized"),
  fitInRange = c(10, 1000),
  group = NULL
)

## S4 method for signature 'CAGEexp'
plotReverseCumulatives(
  object,
  values = c("raw", "normalized"),
  fitInRange = c(10, 1000),
  group = NULL
)

## S4 method for signature 'GRangesList'
plotReverseCumulatives(
  object,
  values = c("raw", "normalized"),
  fitInRange = c(10, 1000),
  group = NULL
)

## S4 method for signature 'GRanges'
plotReverseCumulatives(
  object,
  values = c("raw", "normalized"),
  fitInRange = c(10, 1000),
  group = NULL
)

Arguments

object

A CAGEexp object. Methods are also defined for GRangesList and GRanges objects (see Usage).

values

Plot raw CAGE tag counts (default) or normalized values.

fitInRange

An integer vector with two values specifying the range of tag count values to be used for fitting a power-law distribution to the reverse cumulatives. Ignored if set to NULL. See Details.

group

The name of a column in the column data of the CAGEexp object, to be used to facet the plot. If NULL (default), all the distributions are plotted together. Set to sampleLabels to plot each sample separately.

Details

A power-law distribution is fitted to each reverse cumulative using the values in the range specified by fitInRange. The fitted distribution is defined by

y = -1 * alpha * x + beta

on the log-log scale, and the value of alpha for each sample is shown in the plot's legend. In addition, a suggested referent power-law distribution to which all samples could be normalized is drawn on the plot, together with its parameters (slope alpha and total number of tags T). This referent distribution is chosen so that its slope (alpha) is the median of the slopes fitted to the individual samples and its total number of tags (T) is the power of 10 nearest to the median number of tags of the individual samples. The resulting plots help in deciding whether power-law normalization is appropriate for the given samples, and the reported alpha values aid in choosing the optimal alpha value for power-law normalization (see normalizeTagCount for details).
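The fit and the choice of referent parameters described above can be sketched in base R as follows. This is an illustration only, not CAGEr's internal code: the data are synthetic, all variable names are hypothetical, and the use of base-10 logarithms and of rounding on the log10 scale to find the nearest power of 10 are assumptions made for this example.

```r
# Hypothetical reverse cumulative following an exact power law:
# n_ctss = 1e6 * value^(-1.2) (illustrative values only).
value  <- c(1, 2, 5, 10, 20, 50, 100, 200, 500, 1000)
n_ctss <- 1e6 * value^(-1.2)

# Keep only the points inside the fitting range (compare with fitInRange).
fit_range <- c(10, 1000)
in_range  <- value >= fit_range[1] & value <= fit_range[2]

# Linear fit on the log-log scale: y = -1 * alpha * x + beta,
# with x = log10(value) and y = log10(n_ctss) (assumed log base).
x   <- log10(value[in_range])
y   <- log10(n_ctss[in_range])
fit <- lm(y ~ x)
alpha <- -unname(coef(fit)["x"])
beta  <- unname(coef(fit)["(Intercept)"])

# Referent parameters from hypothetical per-sample slopes and tag totals:
# the median slope, and the power of 10 nearest to the median total
# (nearest on the log10 scale, an assumption of this sketch).
sample_alphas <- c(1.15, 1.20, 1.32)
sample_totals <- c(8.2e5, 1.3e6, 2.4e6)
ref_alpha <- median(sample_alphas)
ref_T     <- 10^round(log10(median(sample_totals)))
```

With real data the points do not lie exactly on a line, so the fitted alpha depends on the chosen range; this is why inspecting the plot before settling on fitInRange is useful.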

Value

A ggplot2::ggplot object containing the plots. The plot can be further modified to change its title or axis labels (see ggplot2::labs). The legend can be removed with ggplot2::guides(col = "none").

Author(s)

Vanja Haberle (original work)

Charles Plessy (port to ggplot2)

References

Balwierz et al. (2009) Methods for analyzing deep sequencing expression data: constructing the human and mouse promoterome with deepCAGE data, Genome Biology 10(7):R79. https://doi.org/10.1186/gb-2009-10-7-r79

See Also

normalizeTagCount

Other CAGEr plot functions: TSSlogo(), hanabiPlot(), plotAnnot(), plotCorrelation(), plotExpressionProfiles(), plotInterquantileWidth()

Other CAGEr normalised data functions: normalizeTagCount()

Examples

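# Assign one color per sample and add a grouping column to the column data.
exampleCAGEexp <- setColors(exampleCAGEexp,
  c("salmon", "darkkhaki", "darkturquoise", "blueviolet", "blueviolet"))
exampleCAGEexp$grp <- c("a", "b", "b", "c", "c")
# Raw tag counts, with the power law fitted between 5 and 100 tags.
plotReverseCumulatives(exampleCAGEexp, fitInRange = c(5, 100))
# Normalized values, one facet per sample.
plotReverseCumulatives(exampleCAGEexp, values = "normalized",
                       fitInRange = c(200, 2000), group = "sampleLabels")
# Samples 4 and 5 only, with a title added to the returned ggplot object.
plotReverseCumulatives(exampleCAGEexp[, 4:5], fitInRange = c(5, 100)) +
  ggplot2::ggtitle("prim6 replicates")
# Tag clusters instead of CTSS, with default arguments.
tagClustersGR(exampleCAGEexp) |> plotReverseCumulatives()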


charles-plessy/CAGEr documentation built on Aug. 2, 2024, 4:35 p.m.