plot.density.summary: Plot the distribution of overall NGS density at specific...


plot.density.summary    R Documentation

Plot the distribution of overall NGS density at specific regions from deepTools matrices.

Description

Computes the score of each element in a list of regions and generates violin plots with percentiles and, optionally, the mean for each sample/region. It takes as input a score matrix computed by deepTools' computeMatrix function or by the computeMatrix.deeptools and density.matrix functions from this package.

Usage

## S3 method for class 'density.summary'
plot(
  matrix.file,
  plot.by.group = T,
  missing.data.as.zero = NULL,
  sample.names = NULL,
  region.names = NULL,
  signal.type = "mean",
  linear = F,
  error.type = "sem",
  show.mean = T,
  mean.error.type = "se",
  mean.color = "blue",
  mean.symbol.shape = 20,
  mean.symbol.size = 1,
  show.stat.multiplot = T,
  stat.method = "wilcox.test",
  stat.paired = F,
  stat.labels.format = "p.signif",
  stat.hide.ns = T,
  stat.p.levels = list(cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1), symbols = c("****",
    "***", "**", "*", "ns")),
  title = NULL,
  x.lab = NULL,
  y.lab = NULL,
  x.labs.angle = 0,
  dodge.width = 1,
  border.width = 0.5,
  border.color = "#000000",
  transparency = 0.5,
  subset.range = NULL,
  y.lim = NULL,
  y.identical.auto = T,
  y.ticks.interval = NULL,
  y.digits = 1,
  axis.line.width = 0.5,
  text.size = 12,
  legend.position = c(0.2, 0.85),
  colors = c("#00A5CF", "#F8766D", "#AC88FF", "#E08B00", "#00BA38", "#BB9D00", "#FF61C9",
    "gray30"),
  n.row.multiplot = 1,
  multiplot.export.file = NULL,
  real.width.single.violinplot = 1,
  real.height.single.violinplot = 3.5,
  by.row = TRUE,
  print.multiplot = F
)

Arguments

matrix.file

A single string indicating a full path to a matrix.gz file generated by deepTools/computeMatrix or by computeMatrix.deeptools, or a list generated by the function read.computeMatrix.file or density.matrix.

plot.by.group

Logical value defining whether to plot by group of regions or by sample. By default TRUE.

missing.data.as.zero

Logical value defining whether to treat missing data as 0. If set to FALSE, missing data will be converted to NA and excluded from the computation of the signal. By default TRUE.

sample.names

Sample names can be defined by a string vector. If set to NULL, sample names are taken automatically from the matrix file. By default NULL.
Example: c("sample1", "sample2", "sample3")

region.names

Region names can be defined by a string vector. If set to NULL, region names are taken automatically from the matrix file. By default NULL.
Example: c("regionA", "regionB")

signal.type

String indicating the signal to be computed and plotted. Available parameters are "mean", "median" and "sum". By default "mean".
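
As a rough illustration of the three options, the base R sketch below reduces a toy score matrix (one row per region, one column per bin) to a single score per region. The matrix `scores` and its values are hypothetical, and the package's internal computation may differ in detail (for example in how missing values are handled).

```r
# Hypothetical score matrix: 3 regions (rows) x 4 bins (columns)
scores <- matrix(
  c(1, 2, 3, 4,
    0, 0, 5, 5,
    2, 2, 2, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("region1", "region2", "region3"), NULL)
)

# One score per region for each signal.type option
region.mean   <- rowMeans(scores)          # analogous to signal.type = "mean"
region.median <- apply(scores, 1, median)  # analogous to signal.type = "median"
region.sum    <- rowSums(scores)           # analogous to signal.type = "sum"
```

In this toy case the third region shows why the choice matters: a single high bin raises its mean well above its median.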

linear

Logical value to define whether the plots should show the score in linear scale. By default FALSE.

error.type

String indicating the type of error to be computed and made available in the output data.table. Available options are "sem" (standard error of the mean) and "sd" (standard deviation). By default "sem". Considered only when show.mean = TRUE.
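
For reference, the two error measures relate as in the base R sketch below. The vector `x` is hypothetical toy data, and this shows only the textbook definitions, not necessarily the exact code path used by the function.

```r
# Hypothetical per-region scores for one sample/group
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

score.sd  <- sd(x)                    # standard deviation ("sd")
score.sem <- sd(x) / sqrt(length(x))  # standard error of the mean ("sem")
```

The standard error of the mean shrinks as the number of regions grows, while the standard deviation does not, so "sem" usually gives much narrower error values for large region sets.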

show.mean

Logical value to define whether the mean value should be shown as a symbol on the plots. By default TRUE.

mean.error.type

String indicating the type of error to be computed for the mean. Available options are "se" (standard error), "sd" (standard deviation) and "none" (no error plotted). By default "se". Considered only when show.mean = TRUE.

mean.color

A single string expressing an R-supported color for the mean symbol. By default "blue".

mean.symbol.shape

A numeric value or string defining the shape for the mean symbol. By default 20.

mean.symbol.size

A numeric value defining the size of the mean symbol. By default 1.

show.stat.multiplot

Logical value defining whether to add to the plot the statistical comparisons of the means for the groups present in the multiplot. All possible comparisons will be performed. By default TRUE.

stat.method

A single string defining the method to use for the statistical comparisons. Available options: "t.test" and "wilcox.test". By default "wilcox.test".

stat.paired

Logical value defining whether the statistical comparisons should be paired. By default FALSE. Note that a paired comparison requires the same number of data points in the two groups compared, so in most cases it is not applicable to comparisons between two regions. Used only by the "t.test" and "wilcox.test" methods.
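
The difference between paired and unpaired comparisons can be seen with the base R functions that share their names with the stat.method options. This is only a sketch on simulated data: the vectors `sample1` and `sample2` are hypothetical, and how the function invokes these tests internally is not shown here.

```r
set.seed(1)  # reproducible toy data

# Hypothetical scores of the same 20 regions in two samples
sample1 <- rnorm(20, mean = 5)
sample2 <- sample1 + rnorm(20, mean = 0.5, sd = 0.3)

# Unpaired tests (as with stat.paired = FALSE): group sizes may differ
unpaired.wilcox <- wilcox.test(sample1, sample2, paired = FALSE)
unpaired.t      <- t.test(sample1, sample2, paired = FALSE)

# Paired tests (as with stat.paired = TRUE): both vectors must have the
# same length, with element i of each referring to the same region
paired.wilcox <- wilcox.test(sample1, sample2, paired = TRUE)
paired.t      <- t.test(sample1, sample2, paired = TRUE)

# Compare the resulting p-values
c(unpaired = unpaired.wilcox$p.value, paired = paired.wilcox$p.value)
```

Two different region groups generally contain different numbers of regions, which is why pairing rarely applies to region-versus-region comparisons.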

stat.labels.format

A single string indicating the format of the p-value to show for the statistical comparisons. By default "p.signif". Available options: "p.format" (normal p-value), "p.signif" (significance stars), "p.adj" (p-value adjusted).

stat.hide.ns

Logical value indicating whether the NS ("Not Significant") comparisons should be hidden. By default TRUE.

stat.p.levels

A list containing the p-values levels/thresholds in the following format (default): list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05, 1), symbols = c("****", "***", "**", "*", "ns")). In other words, we use the following convention for symbols indicating statistical significance:

  • ns: p > 0.05

  • *: p <= 0.05

  • **: p <= 0.01

  • ***: p <= 0.001

  • ****: p <= 0.0001
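
The convention above can be reproduced in base R with `cut()`, which is a convenient way to check a custom set of cutpoints before passing it to the function. This is a sketch of the mapping only; the p-values are hypothetical and the function's internal labelling code may be implemented differently.

```r
# Default stat.p.levels, as given in the Usage section
stat.p.levels <- list(
  cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
  symbols   = c("****", "***", "**", "*", "ns")
)

# Hypothetical p-values to label
p.values <- c(0.00005, 0.0005, 0.005, 0.03, 0.05, 0.2)

# Right-closed intervals: (0.01, 0.05] maps to "*", (0.05, 1] maps to "ns"
p.symbols <- as.character(cut(
  p.values,
  breaks = stat.p.levels$cutpoints,
  labels = stat.p.levels$symbols,
  include.lowest = TRUE
))
```

Note that there must be one symbol fewer than cutpoints, since each symbol labels the interval between two consecutive cutpoints.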

title

The title of each plot can be defined by a string vector. If set to NULL, titles will be generated automatically. By default NULL.
Example: c("Title1", "Title2")

x.lab

Single string or string vector to define the X-axis label for all the plots. By default NULL, the label will be defined automatically.

y.lab

Single string or string vector to define the Y-axis label for all the plots. By default NULL, the label will be defined automatically.

x.labs.angle

A single numeric value indicating the degrees of rotation of the category labels in the X-axis. By default 0, horizontal without rotation.

dodge.width

Numeric value defining the width of each single violin plot. By default 1.

border.width

Numeric value to define the border width for all the violin plots. By default 0.5.

border.color

A single string indicating the color to use for the border of the violin plots. By default "#000000" (full black).

transparency

A numeric value defining the opacity of the plot fill (0 = fully transparent, 1 = fully opaque). By default 0.5.

subset.range

A numeric vector indicating the range to which the analyses should be restricted (e.g. c(-150, 250)). In "scale-region" mode, the range is represented by (-upstream | 0 | body_length | body_length+downstream). By default NULL: the whole region is considered.

y.lim

List of numeric vectors with two elements each to define the range of the Y-axis. To set only one side, use NA for the side to be left automatic. If only one range is given, it will be applied to all the plots. By default NULL: the range will be defined automatically.
Example: list(c(0, 20), c(NA, 30), c(0, NA), c(NA, NA)).

y.identical.auto

Logical value defining whether to use the same Y-axis range for all the plots, chosen automatically from the values. Not used when y.lim is not NULL. By default TRUE.

y.ticks.interval

A number indicating the spacing between two consecutive ticks on the Y-axis. By default NULL: ticks are assigned automatically. Active only when y.identical.auto = TRUE and y.lim != NULL.

y.digits

A numeric value defining the number of decimal digits to use for the Y-axis values. By default 1 (e.g. 1.5).

axis.line.width

Numeric value to define the axes and ticks line width for all plots. By default 0.5.

text.size

Numeric value to define the size of the text for the labels of all the plots. By default 12.

legend.position

Any ggplot-supported value for the legend position (e.g. "none", "top", "bottom", "left", "right", c(fraction.x, fraction.y)). By default c(0.2, 0.85).

colors

Vector to define the line and error area colors. If only one value is provided, it will be applied to all the samples/groups. If the number of values is lower than the required one, a random set of colors will be generated. All standard R color values are accepted. By default c("#00A5CF", "#F8766D", "#AC88FF", "#E08B00", "#00BA38", "#BB9D00", "#FF61C9", "gray30").

n.row.multiplot

Numeric value to define the number of rows in the final multiplot. By default 1.

multiplot.export.file

If a string with the name of a PDF file is provided, the multiplot will be exported to that file. By default NULL.

real.width.single.violinplot

Numeric value, in inches, to define the real width (not precise) of each single violin plot in the multiplot exported, if required. By default 1 inch.

real.height.single.violinplot

Numeric value, in inches, to define the real height (not precise) of each single violin plot in the multiplot exported, if required. By default 3.5 inches.

by.row

Logical value to define whether the plots should be arranged by row. By default TRUE.

print.multiplot

Logical value to define whether to print the multiplot once generated. By default FALSE.

Details

To learn more about the deepTools computeMatrix function, see the package manual at the following link:
https://deeptools.readthedocs.io/en/develop/content/tools/computeMatrix.html.

Value

The function returns a list containing:

  • data.table with the computed values used for the plot;

  • metadata table with the information obtained from the matrix_file.gz;

  • plot.list with a plot for each list element;

  • density.profile with the density profile of the mean signal generated by plot.density.profile corresponding to the regions/samples for which the summary multiplot has been generated;

  • multiplot with the image of all the plots together;

  • summary.plot.samples with a plot showing the scores of all regions for each sample;

  • summary.plot.regions with a plot showing the scores of all samples for each region;

  • means.comparisons table with the statistical means comparisons (when show.stat.multiplot = TRUE, otherwise a string is returned).
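
A minimal call might look like the sketch below. Several things here are assumptions rather than documented facts: the file path is a placeholder, the function is assumed to be callable by name once the package is attached, and the list elements are assumed to be named exactly as in the list above. The guard makes the block a no-op when the package or the file is not available.

```r
# Hypothetical path: replace with a real computeMatrix output file
matrix.path <- "path/to/matrix.gz"

if (requireNamespace("Rseb", quietly = TRUE) && file.exists(matrix.path)) {
  library(Rseb)

  # Arguments and defaults are taken from the Usage section
  summary.result <- plot.density.summary(
    matrix.file = matrix.path,
    plot.by.group = TRUE,
    signal.type = "mean",
    stat.method = "wilcox.test",
    stat.paired = FALSE,
    y.lim = list(c(0, NA)),   # fix only the lower Y-axis limit
    print.multiplot = FALSE
  )

  # Assumed element names, following the Value list
  print(summary.result$multiplot)
  head(summary.result$data.table)
  summary.result$means.comparisons
}
```

Setting print.multiplot = FALSE and printing the multiplot element explicitly keeps plotting under the caller's control, which is convenient in scripts that also export the figure through multiplot.export.file.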


sebastian-gregoricchio/Rseb documentation built on May 15, 2024, 5:45 a.m.