require(ggfastqc)
knitr::opts_chunk$set(
  comment = "#",
  error = FALSE,
  tidy = FALSE,
  cache = FALSE,
  collapse=TRUE)
# options(datatable.auto.index=FALSE)

ggfastqc package

The ggfastqc package allows quick summary plots of FastQC reports from Next Generation Sequencing data.

There are four functions for plotting various summary statistics:

The function fastqc() loads the entire report as an object of class fastqc which can be used to generate any additional plots that are required.


Load fastqc report

The fastqc() function loads data from FastQC generated reports via the argument sample_info which should be a file containing info about samples. The file should contain at least these three columns:

It can also optionally contain a group column. If present, the plots generated will take it into account and color / facet accordingly.

Note: {.bs-callout .bs-callout-warning}

It is recommended to have a group column.

Sample annotation file

path = system.file("tests/fastqc-sample", package="ggfastqc")
ann_file = file.path(path, "annotation.txt")
path = "./"
ann_file = file.path(path, "annotation.txt")

Here's how an annotation file might look like.

data.table::fread(ann_file)

Using fastqc() to load reports

obj = fastqc(ann_file)
obj
class(obj)

{.bs-callout .bs-callout-info}

GC stats

plot_gc_stats() provides a plot of GC percentage in each of the samples. By default the argument interactive = TRUE, in which case it will try to plot a jitter plot using the plotly package. Jitter plots are possible only when interactive = TRUE.

The other two types of plots possible are point and bar. Plots can be interactive or static for these two types of plots. If static, the function returns a ggplot2 plot.

Interative plot using plotly

plot_gc_stats(sample=obj)
pl = plot_gc_stats(sample=obj)
ll = htmltools::tagList()
ll[[1L]] = plotly::as.widget(pl)
ll

Note that the facet is automatically named sample which was the name provided to the input argument. More than one such fastqc object can be provided to a single function to generate facetted plot as shown above, for e.g., plot_gc_stats(s1 = obj1, s2 = obj2).

Using interactive=FALSE would result in a static ggplot2 plot, but jitter geom is not possible then.

Static plot using ggplot2

plot_gc_stats(sample=obj, interactive=FALSE, geom="point") # or "bar"

Sequence duplication percentage

plot_dup_stats() provides a plot of total reads sequenced. The usage is also identical to plot_gc_stats.

Interative plot using plotly

plot_dup_stats(sample=obj)
pl = plot_dup_stats(sample=obj)
ll = htmltools::tagList()
ll[[1L]] = plotly::as.widget(pl)
ll

Static plot using ggplot2

plot_dup_stats(sample=obj, interactive=FALSE, geom="point") # or "bar"

Total sequenced reads

plot_total_sequence_stats() provides a plot of total reads sequenced. The usage is also identical to plot_gc_stats.

Interative plot using plotly

plot_total_sequence_stats(sample=obj)
pl = plot_total_sequence_stats(sample=obj)
ll = htmltools::tagList()
ll[[1L]] = plotly::as.widget(pl)
ll

Static plot using ggplot2

plot_total_sequence_stats(sample=obj, interactive=FALSE, geom="bar") # or "point"

Per base sequence quality

plot_sequence_quality() provides a plot of per base sequence quality. The only geom implemented is line. Both interactive and non-interactive plots are possible, as shown below.

Interative plot using plotly

plot_sequence_quality(sample=obj)
pl = plot_sequence_quality(sample=obj)
ll = htmltools::tagList()
ll[[1L]] = plotly::as.widget(pl)
ll

Static plot using ggplot2

plot_sequence_quality(sample=obj, interactive=FALSE)



openanalytics/ggfastqc documentation built on May 24, 2019, 2:27 p.m.