QCplots: Function QCplots


View source: R/QCplots.R

Description

This function takes a data frame with metric names in the first column and samples in columns 2-n; each row is a different QC metric. It returns a list of QC plots, configured by the other arguments. If only one metric is plotted, a single ggplot object is returned. By default, horizontal reference lines are drawn at the median and at +/- n SDs, where n is set by the hlineSD argument. These are statistical reference points, NOT pass/fail limits.

Usage

QCplots(
  qcdata,
  metricNames,
  sampleNames,
  plotType = "bar",
  barColor = "dodgerblue4",
  barFill = "dodgerblue3",
  barSize = 0.1,
  barAlpha = 1,
  barWidth = 0.9,
  pointColor = "dodgerblue4",
  pointFill = "dodgerblue3",
  pointShape = 21,
  pointAlpha = 1,
  pointSize = 4,
  lineColor = "dodgerblue4",
  lineSize = 1,
  lineType = "solid",
  lineAlpha = 1,
  histColor = "dodgerblue4",
  histFill = "dodgerblue3",
  histSize = 1,
  histAlpha = 1,
  xAngle = 90,
  baseTextSize = 14,
  hlineSD = 3,
  winsorize = TRUE,
  debug = FALSE
)

Arguments

qcdata

A dataframe or tibble with metric names in the first column and samples in columns 2-n. Each row is a different QC metric. This matches the Omicsoft RNA-Seq.QCMetrics.Table.txt format. (required)
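
As an illustration of this layout, here is a minimal sketch of a toy table in base R. The values and sample names are made up, and the first column name (`Metric`) is an assumption; the metric names are borrowed from the Examples section.

```r
# Toy QC table: metric names in column 1, one column per sample (values are made up)
qcdata <- data.frame(
  Metric  = c("Alignment_MappedRate", "Source_rRNA", "Profile_ExonRate"),
  SampleA = c(0.95, 0.02, 0.71),
  SampleB = c(0.93, 0.03, 0.69),
  SampleC = c(0.61, 0.20, 0.40),
  stringsAsFactors = FALSE
)

# Metrics that could be passed to the metricNames argument
availableMetrics <- qcdata[[1]]

# Sample names as found in the column names (columns 2-n)
defaultSampleNames <- colnames(qcdata)[-1]
```

Each row holds one metric and each remaining column holds one sample, which is the shape the function description asks for.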

metricNames

A list of metrics to plot. Values must exist in column 1 of the data frame. (required)

sampleNames

Optional alternative sample names. By default, the sample names are taken from the column names of qcdata. Supply the alternative names in the order the samples appear in the qcdata data frame (columns 2-n). The plot order is based on an alphabetical sort of the original column names, so it may differ from the column order in the supplied data frame.
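
The ordering caveat can be illustrated with a small base R sketch. The column names and labels below are hypothetical, and using `order()` is only an assumption about how the alphabetical sort behaves, not a description of the function's internals.

```r
# Hypothetical column names as they appear in qcdata (columns 2-n)
originalNames <- c("C_rep1", "A_rep1", "B_rep1")

# Alternative labels, supplied in the same order as the qcdata columns
sampleNames <- c("Treated 1", "Control 1", "Control 2")

# Alphabetical sort of the original column names
plotOrder <- order(originalNames)

# Labels in the order they would then appear on the x axis
labelsInPlotOrder <- sampleNames[plotOrder]
labelsInPlotOrder
```

Here the first column of the data ("C_rep1", labelled "Treated 1") would be drawn last, so the labels must be matched to the data columns rather than to the plot order.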

plotType

One of "bar", "point", "pointline". To use a different plot type for each metric, pass a list of plot types with length equal to length(metricNames). (Default = "bar")
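
A minimal sketch of preparing a per-metric plot type list, with metric names taken from the Examples section. The final call is left commented out because it needs the package and a loaded qcdata object.

```r
# Metrics to plot (names taken from the Examples section)
metricNames <- c("Alignment_MappedRate", "Source_rRNA", "Profile_ExonRate")

# One plot type per metric, in the same order as metricNames
plotTypes <- list("bar", "point", "pointline")

# The two lengths must match when a list of plot types is supplied
stopifnot(length(plotTypes) == length(metricNames))

# With the package installed and qcdata loaded, the call would look like:
# plots <- QCplots(qcdata, metricNames = metricNames, plotType = plotTypes)
```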

barColor

Color for the bar outline (default = "dodgerblue4")

barFill

Color for the bar area (default = "dodgerblue3")

barSize

Thickness of each bar's outline (Default = 0.1)

barAlpha

Transparency for the bar layer (Range = 0-1) (Default = 1)

barWidth

Width of the bars (Default = 0.9)

pointColor

Color for the point layer (Default = "dodgerblue4")

pointFill

Fill color for the point layer (Default = "dodgerblue3")

pointShape

Shape for the point layer (Default = 21; fillable circle)

pointAlpha

Transparency for the point layer (Range = 0-1) (Default = 1)

pointSize

Size of the points (Default = 4)

lineColor

Color of the line (Default = "dodgerblue4")

lineSize

Thickness of the line (Default = 1)

lineType

One of c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash"). (Default = "solid")

lineAlpha

Transparency for the line layer (Range = 0-1) (default=1)

histColor

Outline color for the histogram (Default = "dodgerblue4")

histFill

Fill color for the histogram (Default = "dodgerblue3")

histSize

Thickness of the bar borders (Default = 1)

histAlpha

Transparency of the histogram (Range = 0-1) (Default = 1)

xAngle

Angle to set the sample labels on the Xaxis (Default = 90; Range = 0-90)

baseTextSize

Base text size for the plot (Default = 14)

hlineSD

Draw horizontal reference lines at the median value and at +/- hlineSD standard deviations. (Default = 3; set to 0 to disable the reference lines)
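
The arithmetic behind the reference lines can be sketched in base R. The metric values are made up, and centering the +/- lines on the median is an assumption based on the Description, not a confirmed detail of the implementation.

```r
# Made-up values of one QC metric across six samples
metricValues <- c(0.95, 0.93, 0.94, 0.96, 0.92, 0.61)
hlineSD <- 3

# Center line at the median
centerLine <- median(metricValues)

# Assumption: the outer lines sit hlineSD standard deviations either side of the median
spread <- sd(metricValues)
upperLine <- centerLine + hlineSD * spread
lowerLine <- centerLine - hlineSD * spread

c(lower = lowerLine, median = centerLine, upper = upperLine)
```

Note how a single extreme sample inflates the ordinary standard deviation and pushes the outer lines apart, which is the situation the winsorize argument is meant to address.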

winsorize

Use a robust method to calculate the standard deviation that places the horizontal reference lines (see the hlineSD argument). The adaptive winsorization used here only trims extreme values when normality is violated. See https://www.r-bloggers.com/winsorization/ for details. (Default = TRUE)
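
To make the idea concrete, here is a rough sketch of one way adaptive winsorization could work. The helper `winsorizedSD`, its Shapiro-Wilk normality check, and its 5%/95% quantile limits are all assumptions chosen for illustration; the package's own method may differ.

```r
# Hypothetical helper, not the package's code: clamp extreme values only when a
# Shapiro-Wilk test rejects normality, then return the standard deviation.
winsorizedSD <- function(x, probs = c(0.05, 0.95), alpha = 0.05) {
  x <- x[!is.na(x)]
  # shapiro.test needs 3 to 5000 values that are not all identical
  canTest <- length(x) >= 3 && length(x) <= 5000 && length(unique(x)) > 1
  if (canTest && shapiro.test(x)$p.value < alpha) {
    limits <- quantile(x, probs = probs, names = FALSE)
    # Pull values beyond the quantile limits back to the limits
    x <- pmin(pmax(x, limits[1]), limits[2])
  }
  sd(x)
}

# Made-up metric with one extreme sample
metricValues <- c(0.95, 0.93, 0.94, 0.96, 0.92, 0.94, 0.95, 0.93, 0.61)
sd(metricValues)            # ordinary SD, inflated by the extreme sample
winsorizedSD(metricValues)  # SD after the adaptive clamp
```

When the data look normal, the helper leaves them untouched and returns the ordinary standard deviation.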

Value

A ggplot object if a single metric is specified; a list of ggplot objects if two or more metrics are specified.
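
Because the return type depends on the number of metrics, calling code may want to normalize it. The sketch below uses a hypothetical helper, `asPlotList`, and stand-in objects carrying the "ggplot" class so that it runs without ggplot2 or the package installed.

```r
# Hypothetical helper: always return a list of plots, whatever QCplots gave back
asPlotList <- function(x) {
  if (inherits(x, "ggplot")) list(x) else x
}

# Stand-in object so the sketch runs without ggplot2 or the package
fakePlot <- structure(list(), class = c("gg", "ggplot"))

length(asPlotList(fakePlot))                  # single plot wrapped in a list
length(asPlotList(list(fakePlot, fakePlot)))  # list passed through unchanged
```

With this wrapper, indexing such as `plots[[1]]` behaves the same whether one metric or several were requested.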

Author(s)

John Thompson, jrt@thompsonclan.org

Examples

  # Example 1:
  # Get QC data from an Omicsoft project in S3
  S3mount <- "/arrayserver" #where you mount the S3 bucket "bmsrd-ngs-arrayserver"
  s3path <- "/OmicsoftHome/output/P-20180326-0001/TempleUniv_HeartFailure2017_P-20180326-0001_R94_24Jan2019/ExportedViewsAndTables"
  qcfilename <- "RNA-Seq.QCMetrics.Table.txt" # standard name for the QC file in Omicsoft projects
  qcdata <- readr::read_delim(file.path(S3mount, s3path, qcfilename), delim="\t") # read from the mounted bucket
  colnames(qcdata) <- stringr::str_sub(colnames(qcdata), 1, 16) # shorten the sample names

  # pick some Omicsoft Metrics from column 1 of the data frame
  someFavMetrics <- c("Alignment_MappedRate", "Alignment_PairedRate",
                              "Source_rRNA", "Strand_Read1AntiSense",
                              "Strand_ReadPairAntiSense", "Profile_ExonRate",
                              "Profile_InterGene_FPK")

  MyQCplots <- QCplots(qcdata, metricNames=someFavMetrics) #all defaults
  # draw the first plot
  print(MyQCplots[[1]])

  # Example 2:
  # Get QC data from an Xpress Project
  plotdat <- Xpress2R::getXpressQC(xid="20261", level = "rn6ERCC-ensembl82-genes")
  myMetrics <- c("QC_CodingBases", "QC_EstimatedLibrarySize")
  p <- QCplots(plotdat, metricNames=myMetrics)
  # draw the first plot
  print(p[[1]])

jrthompson54/DGE.Tools2 documentation built on May 12, 2021, 8:47 p.m.