plot_metrics: Plot RNAseq metrics

View source: R/plot_metrics.R

plot_metrics    R Documentation

Plot RNAseq metrics

Description

This function plots several common RNAseq metrics (percent of reads aligned, total reads, median CV coverage) against each other for a set of libraries. It draws horizontal and vertical lines at the standard QC thresholds for each metric. It also draws lines at standard outlier thresholds and labels points beyond these thresholds with their library identifiers. Points are optionally colored by the values of a discrete or continuous variable in design, specified by by_var. Some options are not yet implemented, e.g. plotting names of all libraries, modifying threshold values for each metric, etc.
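The two kinds of threshold described above can be illustrated in base R. This is a minimal sketch and not the package's own code: the toy values are invented, the helper outlier_fences() is hypothetical, and the use of R's default quantile method is an assumption. The column names and cutoffs follow the defaults documented below.

```r
# Toy metrics table; the values are made up for illustration only.
metrics <- data.frame(
  lib.id = paste0("lib", 1:8),
  mapped_reads_w_dups = c(0.95, 0.91, 0.78, 0.93, 0.89, 0.92, 0.94, 0.60),
  pf_hq_aligned_reads = c(12.1, 9.8, 4.2, 11.5, 10.3, 8.9, 30.0, 10.7),
  median_cv_coverage = c(0.55, 0.61, 1.20, 0.58, 0.63, 0.60, 0.57, 0.59),
  stringsAsFactors = FALSE
)

# Fixed QC thresholds, using the documented default values.
fails_qc <- with(
  metrics,
  mapped_reads_w_dups < 0.8 | pf_hq_aligned_reads < 5 | median_cv_coverage > 1
)

# Hypothetical helper: outlier fences at Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
# Assumes R's default quantile method; the package may compute these differently.
outlier_fences <- function(x) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)
}

fences <- outlier_fences(metrics$pf_hq_aligned_reads)
is_outlier <- metrics$pf_hq_aligned_reads < fences["lower"] |
  metrics$pf_hq_aligned_reads > fences["upper"]

metrics$lib.id[fails_qc]    # libraries beyond a fixed QC threshold
metrics$lib.id[is_outlier]  # libraries beyond an outlier fence for total reads
```

A library can fall outside an outlier fence without failing a fixed threshold, and the reverse, which is why the two sets of lines are drawn separately.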

Usage

plot_metrics(
  metrics,
  metrics.libID_col = "lib.id",
  design = NULL,
  design.libID_col = "lib.id",
  threshold.perc_aligned = 0.8,
  column.perc_aligned = "mapped_reads_w_dups",
  threshold.total_reads = 5,
  column.total_reads = "pf_hq_aligned_reads",
  threshold.median_cv_coverage = 1,
  column.median_cv_coverage = "median_cv_coverage",
  by_var = NULL,
  by_var_levels = NULL,
  by_var_lab = NULL,
  my_cols = c("blue", "red"),
  na_col = "grey50",
  point_names = "thresholded",
  point_size = 1,
  plot_outlier_lines = TRUE,
  file_prefix = NULL,
  plotdims = c(9, 9)
)
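A call might look like the following. This is a sketch under assumptions: the data are invented, the design column "treatment" is hypothetical, and the call is guarded so that it runs only if the RNAseQC package is installed. Only arguments listed in the usage above are passed.

```r
# Hypothetical toy inputs; metric column names follow the documented defaults.
metrics <- data.frame(
  lib.id = paste0("lib", 1:6),
  mapped_reads_w_dups = c(0.95, 0.91, 0.78, 0.93, 0.89, 0.92),
  pf_hq_aligned_reads = c(12.1, 9.8, 4.2, 11.5, 10.3, 8.9),
  median_cv_coverage = c(0.55, 0.61, 1.20, 0.58, 0.63, 0.60),
  stringsAsFactors = FALSE
)

# Library annotation; "treatment" is an invented variable used for coloring.
design <- data.frame(
  lib.id = paste0("lib", 1:6),
  treatment = rep(c("control", "stimulated"), each = 3),
  stringsAsFactors = FALSE
)

# Run the plot only when the package is available.
if (requireNamespace("RNAseQC", quietly = TRUE)) {
  RNAseQC::plot_metrics(
    metrics,
    design = design,
    by_var = "treatment",
    by_var_levels = c("control", "stimulated"),
    by_var_lab = "Treatment",
    my_cols = c("blue", "red"),
    point_names = "thresholded"
  )
}
```

Both tables use the default "lib.id" identifier column, so metrics.libID_col and design.libID_col are left at their defaults.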

Arguments

metrics

matrix or data frame containing values of metrics. Should have metrics in columns and libraries in rows.

metrics.libID_col

name or number of the column in metrics containing the library identifiers.

design

matrix or data frame containing the library information. Should have variables in columns and libraries in rows.

design.libID_col

name or number of the column in design containing the library identifiers.

threshold.perc_aligned

numeric, the threshold for percent of reads aligned. Libraries with values below this threshold are flagged. Defaults to 0.8. Set to NULL to remove threshold.

column.perc_aligned

the name or number of the column in metrics containing the percent of reads aligned. Defaults to "mapped_reads_w_dups".

threshold.total_reads

numeric, the threshold for total reads, in millions. Libraries with values below this threshold are flagged. Defaults to 5. Set to NULL to remove threshold.

column.total_reads

the name or number of the column in metrics containing the total reads, assumed to be in millions of reads. Defaults to "pf_hq_aligned_reads".

threshold.median_cv_coverage

numeric, the threshold for median CV coverage. Libraries with values above this threshold are flagged. Defaults to 1. Set to NULL to remove threshold.

by_var

(optional) character string or integer identifying the column in design to color points by. If not provided, points are plotted in black.

by_var_levels

(optional) character vector defining the order of elements in the variable used for coloring points; this order is used for the plot legend and to match the order of colors (if provided). If not provided, levels of the variable are ordered by order of appearance in the design object.

by_var_lab

(optional) string to be used as the title for the color legend.

my_cols

(optional) vector of colors to use for plotting. If by_var is numeric, should have two elements, providing the start and end points for a continuous color scale (generated by scale_color_gradient). If by_var is not numeric, should be a vector with one color for each level of by_var; if the number of values supplied is less than the number of levels in by_var, additional values are interpolated using colorRampPalette. By default, uses a range from blue to red.

na_col

color to use for NA values of by_var.

point_names

the points to label in the plot. Defaults to "thresholded", which selects the points outside the thresholds. Can be a character vector of library IDs to plot. Set to NULL to remove all point labels.

point_size

numeric, the size of the points to be plotted. Defaults to 1.

plot_outlier_lines

logical, whether to plot the lines where points would be considered outliers, based on the Q1-1.5*IQR / Q3+1.5*IQR threshold. Defaults to TRUE.

file_prefix

a character string. If provided, the function outputs PDFs of the plots, named "file_prefix_plot_name.pdf". If not provided, the function prints to a plotting window.

plotdims

a numeric vector giving the size (in inches) of the plotting object: either the size of the PDFs or the size of the plotting window.

column.median_cv_coverage

the name or number of the column in metrics containing the median CV coverage. Defaults to "median_cv_coverage".
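The handling of by_var_levels and my_cols for a non-numeric by_var can be sketched in base R. This is an illustration of the documented behavior, not the package's internal code; the variable values are hypothetical.

```r
# Hypothetical values of a discrete by_var column from design.
by_var_values <- c("control", "low", "high", "control", "high", "low")

# With by_var_levels not provided, levels follow order of first appearance.
by_var_levels <- unique(by_var_values)

# Default color range from the usage above.
my_cols <- c("blue", "red")

# Fewer colors than levels: interpolate additional colors along the range.
if (length(my_cols) < length(by_var_levels)) {
  my_cols <- grDevices::colorRampPalette(my_cols)(length(by_var_levels))
}

# Pair each level with its color, in legend order.
names(my_cols) <- by_var_levels
my_cols
```

With three levels and the default two colors, the first and last levels keep the supplied endpoints and the middle level receives an interpolated color.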


mjdufort/RNAseQC documentation built on April 19, 2024, 3:13 p.m.