View source: R/plot.density.differences.R

plot.density.differences    R Documentation

Plot the distribution of overall NGS density at specific regions from deepTools matrices.

Description

Computes the score of each element in a list of regions and generates violin plots with percentiles and the mean (optional) for each sample/region. It uses as input a score matrix computed by the deepTools computeMatrix function, or by the computeMatrix.deeptools and density.matrix functions from this package.

Usage

## S3 method for class 'density.differences'
plot(
  matrix.file,
  missing.data.as.zero = NULL,
  sample.names = NULL,
  region.names = NULL,
  signal.type = "mean",
  error.type = "sem",
  subset.range = NULL,
  inverted.comparisons = F,
  stat.method = "wilcox.test",
  stat.paired = T,
  stat.p.levels = list(cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1), symbols = c("****",
    "***", "**", "*", "ns")),
  area.line.width = 0.5,
  area.fill.area = T,
  area.plot.zero.line = T,
  area.y.identical.auto = T,
  area.y.ticks.interval = NULL,
  area.y.digits = 1,
  correlation.log2 = T,
  correlation.plot.correlation = T,
  correlation.correlation.method = "lm",
  correlation.show.equation = T,
  correlation.correlation.line.width = 0.75,
  correlation.correlation.line.color = "purple",
  correlation.correlation.line.type = 1,
  correlation.correlation.line.SE = T,
  correlation.correlation.formula = "y ~ x",
  correlation.add.rug = T,
  correlation.x.identical.auto = T,
  correlation.y.identical.auto = T,
  correlation.x.ticks.interval = NULL,
  correlation.y.ticks.interval = NULL,
  correlation.x.digits = 1,
  correlation.y.digits = 1,
  points.size = 0.5,
  transparency = 0.25,
  axis.line.width = 0.5,
  text.size = 12,
  legend.position = c(0.2, 0.85),
  colors = c(Sample1 = "#F8766D", Sample2 = "#00A5CF", `No difference` = "#00BA38"),
  n.row.multiplot = 1,
  by.row = T
)

Arguments

matrix.file

A single string indicating a full path to a matrix.gz file generated by deepTools/computeMatrix or by computeMatrix.deeptools, or a list generated by the function read.computeMatrix.file or density.matrix.

missing.data.as.zero

Logical value defining whether to treat missing data as 0. If set to FALSE, missing data will be converted to NA and excluded from the computation of the signal. By default TRUE.

sample.names

Sample names can be defined by a string vector. If set to NULL, sample names will be obtained automatically from the matrix file. By default NULL.
Example: c("sample1", "sample2", "sample3")

region.names

Region names can be defined by a string vector. If set to NULL, region names will be obtained automatically from the matrix file. By default NULL.
Example: c("regionA", "regionB")

signal.type

String indicating the signal to be computed and plotted/compared. Available parameters are "mean", "median" and "sum". By default "mean".

error.type

String indicating the type of error to be computed and made available in the output data.table. Available parameters are "sem" and "sd" (standard error of the mean and standard deviation, respectively). By default "sem". Parameter considered only when show.mean = TRUE.
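
As a rough illustration of the statistics named by signal.type and error.type, the base R sketch below computes them for a single region. The scores vector is made-up data, and the variable names are hypothetical; how the package computes these values internally is not shown here.

```r
# Made-up scores for a single region (illustrative values only)
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Candidate signals, one per signal.type option
signal.mean   <- mean(scores)
signal.median <- median(scores)
signal.sum    <- sum(scores)

# Error estimates, one per error.type option
error.sd  <- sd(scores)                       # "sd": sample standard deviation
error.sem <- error.sd / sqrt(length(scores))  # "sem": standard error of the mean
```

The standard error of the mean shrinks as the number of scores grows, while the standard deviation does not, which is why the two options can give visibly different error values on large region sets.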

subset.range

A numeric vector indicating the range to which the analyses are restricted (e.g. c(-150, 250)). In "scale-region" mode, the range is represented by (-upstream | 0 | body_length | body_length+downstream). By default NULL: the whole region is considered.

inverted.comparisons

Logical value to indicate whether to invert the order of the pair-comparisons. By default FALSE.

stat.method

A single string defining the method to use for the statistical comparisons. Available options: "t.test", "wilcox.test". By default "wilcox.test".

stat.paired

Logical value defining whether the statistical comparisons should be paired. By default TRUE. Note that a paired comparison requires the same number of data points in the two groups compared, so in most cases it is not applicable to comparisons between two regions. Used only with the "t.test" and "wilcox.test" methods.
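
The constraint on paired comparisons can be seen directly with base R's wilcox.test. The sketch below uses made-up scores and hypothetical variable names; it illustrates the underlying test only and is not taken from the package code.

```r
# Made-up scores for the same five regions in two samples
sample1 <- c(1.2, 2.5, 3.1, 4.8, 5.0)
sample2 <- c(1.9, 3.6, 4.5, 6.7, 7.3)

# Paired test: element i of sample1 is matched with element i of sample2
paired.res <- wilcox.test(sample1, sample2, paired = TRUE)

# Unpaired test: the two vectors are treated as independent groups
unpaired.res <- wilcox.test(sample1, sample2, paired = FALSE)

# A paired test on groups of different sizes raises an error,
# which is captured here as a message instead of stopping the script
paired.error <- tryCatch(
  wilcox.test(sample1, sample2[1:4], paired = TRUE),
  error = function(e) conditionMessage(e)
)
```

When groups differ in size, as is common when comparing two region sets, setting stat.paired = FALSE avoids this error.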

stat.p.levels

A list containing the p-value levels/thresholds in the following format (default): list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05, 1), symbols = c("****", "***", "**", "*", "ns")). In other words, the following convention is used for the symbols indicating statistical significance:

  • ns: p > 0.05

  • *: p <= 0.05

  • **: p <= 0.01

  • ***: p <= 0.001

  • ****: p <= 0.0001
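
One way to read the cutpoints/symbols pairing is as right-closed intervals. The sketch below reproduces the convention above with base R's cut(); the use of cut() and the variable names are assumptions made for illustration, not the package's documented implementation.

```r
# Default thresholds and symbols, as listed above
p.levels <- list(
  cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
  symbols   = c("****", "***", "**", "*", "ns")
)

# Made-up p-values to classify
p.values <- c(0.00002, 0.0005, 0.004, 0.03, 0.05, 0.2)

# Each interval is closed on the right, e.g. (0.01, 0.05] maps to "*";
# include.lowest = TRUE also assigns p = 0 to the first interval
p.symbols <- as.character(cut(
  p.values,
  breaks = p.levels$cutpoints,
  labels = p.levels$symbols,
  include.lowest = TRUE
))
```

A custom list follows the same shape: n + 1 increasing cutpoints from 0 to 1, paired with n symbols.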

area.line.width

Numeric value to define the width of the line connecting the points in the area.plots. By default 0.5.

area.fill.area

Logical value to indicate whether to fill the area under the line in the area.plot. By default TRUE.

area.plot.zero.line

Logical value to define whether to plot a dashed gray vertical line at position 0 of each area.plot. By default TRUE.

area.y.identical.auto

Logical value to define whether to use the same Y-axis range for all the area.plots, set automatically from their values. By default TRUE.

area.y.ticks.interval

A number indicating the interval between two ticks on the Y-axis of the area.plots. By default NULL: ticks are assigned automatically.

area.y.digits

Numeric value defining the number of digits to use for the Y-axis values of the area.plots. By default 1 (e.g. 1.5).

correlation.log2

Logical value to define whether the correlation.plots should show the log2 value of the score. By default TRUE.

correlation.plot.correlation

Logical value to indicate whether to plot the correlation curve on the correlation.plot. By default TRUE.

correlation.correlation.method

Atomic string describing the method to use to compute the regression curve, e.g. "lm", "glm", "gam", "loess", "rlm". By default "lm".

correlation.show.equation

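Logical value to indicate whether to show the equation of the correlation curve on the correlation.plot. By default TRUE.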

correlation.correlation.line.width

Numeric value to define correlation line width for all correlation.plots. By default 0.75.

correlation.correlation.line.color

String to define the correlation line color for all correlation.plots. By default "purple".

correlation.correlation.line.type

A numeric or character value to define the correlation line type. Both numeric and string codes are accepted. By default 1 (equivalent to "solid").

correlation.correlation.line.SE

Logical value to indicate whether to plot the standard error (SE) of the correlation curve in the correlation.plot. By default TRUE.

correlation.correlation.formula

Atomic string indicating the formula to use to compute the correlation curve. By default "y ~ x".
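
To make the default method and formula concrete, the sketch below fits a linear model with formula "y ~ x" in base R. The data are made up, the variable names are hypothetical, and how the package passes these arguments to its plotting code is not shown here.

```r
# Made-up scores for two samples; sample2 is exactly 2 * sample1 + 1
sample1 <- c(1, 2, 3, 4, 5)
sample2 <- 2 * sample1 + 1

# Build the formula from a string, as in the argument's default value
correlation.formula <- as.formula("y ~ x")

# Fit the linear model; x and y are looked up in the data.frame
fit <- lm(correlation.formula, data = data.frame(x = sample1, y = sample2))

# Intercept and slope of the fitted line (the terms of a line equation)
fit.coefficients <- unname(coef(fit))
```

Other formulas follow the usual R syntax, for example "y ~ poly(x, 2)" for a quadratic fit.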

correlation.add.rug

Logical value to indicate whether to add a rug representation (1-d plot) of the data to the correlation.plot. By default TRUE.

correlation.x.identical.auto

Logical value to define whether to use the same X-axis range for all the correlation.plots, set automatically from their values. By default TRUE.

correlation.y.identical.auto

Logical value to define whether to use the same Y-axis range for all the correlation.plots, set automatically from their values. By default TRUE.

correlation.x.ticks.interval

A number indicating the interval between two ticks on the X-axis of the correlation.plots. By default NULL: ticks are assigned automatically.

correlation.y.ticks.interval

A number indicating the interval between two ticks on the Y-axis of the correlation.plots. By default NULL: ticks are assigned automatically.

correlation.x.digits

Numeric value defining the number of digits to use for the X-axis values of the correlation.plots. By default 1 (e.g. 1.5).

correlation.y.digits

Numeric value defining the number of digits to use for the Y-axis values of the correlation.plots. By default 1 (e.g. 1.5).

points.size

A numeric value defining the size of the points in both the area and correlation plots. By default 0.5.

transparency

A numeric value to define the opacity of the fill area in the area.plot and of the SE in the correlation.plot (0 = fully transparent, 1 = fully opaque). By default 0.25.

axis.line.width

Numeric value to define the axes and ticks line width for all plots. By default 0.5.

text.size

Numeric value to define the size of the text for the labels of all the plots. By default 12.

legend.position

Any ggplot-supported value for the legend position (e.g. "none", "top", "bottom", "left", "right", c(fraction.x, fraction.y)). By default c(0.2, 0.85).

colors

Vector of 3 elements to define the point and area colors (the 'Sample1', 'Sample2' and 'No difference' values, respectively). If only one value is provided, it will be applied to all the samples. If the number of values is less than 3, the default color set will be used. All color values supported by R are accepted. By default c("Sample1" = "#F8766D", "Sample2" = "#00A5CF", "No difference" = "#00BA38").

n.row.multiplot

Numeric value to define the number of rows in the final multiplot. By default 1.

by.row

Logical value to define whether the plots should be arranged by row. By default TRUE.

Details

To learn more about the deepTools computeMatrix function, see the package manual at the following link:
https://deeptools.readthedocs.io/en/develop/content/tools/computeMatrix.html.

Value

The function returns a list containing:

  • data.table with the computed values for all groups and all samples;

  • metadata table with the information obtained from the matrix_file.gz;

  • comparison.table.list with a list of tables per group, one table per comparison, containing the original data and the compared values (differences);

  • comparison.statistics.table with a table of all the statistical comparisons;

  • area.plot.byGroup.list with a list per group of all the area.plots of each comparison;

  • correlation.plot.byGroup.list with a list per group of all the correlation.plots of each comparison;

  • area.multiplot.list with an area.multiplot for each group;

  • correlation.multiplot.list with a correlation.multiplot for each group.


sebastian-gregoricchio/Rseb documentation built on May 15, 2024, 5:45 a.m.