plotOverrep-methods: Plot a summary of Over-represented Sequences


Description

Plot a summary of Over-represented Sequences for a set of FASTQC reports

Usage

plotOverrep(x, usePlotly = FALSE, labels, pwfCols, ...)

## S4 method for signature 'ANY'
plotOverrep(x, usePlotly = FALSE, labels, pwfCols, ...)

## S4 method for signature 'character'
plotOverrep(x, usePlotly = FALSE, labels, pwfCols, ...)

## S4 method for signature 'FastqcData'
plotOverrep(
  x,
  usePlotly = FALSE,
  labels,
  pwfCols,
  n = 10,
  ...,
  expand.x = expansion(mult = c(0, 0.05)),
  expand.y = expansion(0, 0.6)
)

## S4 method for signature 'FastqcDataList'
plotOverrep(
  x,
  usePlotly = FALSE,
  labels,
  pwfCols,
  cluster = FALSE,
  dendrogram = FALSE,
  ...,
  paletteName = "Set1",
  expand.x = expansion(mult = c(0, 0.05)),
  expand.y = expansion(0, 0)
)

Arguments

x

A FastqcData object, a FastqcDataList object, or a character vector of file paths

usePlotly

logical. Defaults to FALSE, which renders the plot using ggplot2. If TRUE, the plot is rendered with plotly.

labels

An optional named factor of labels for the file names. All filenames must be present in the names. File extensions are dropped by default.

pwfCols

Object of class PwfCols containing the colours for PASS/WARN/FAIL

...

Used to pass additional attributes to theme() and between methods

n

The number of sequences to plot from an individual file

expand.x, expand.y

Output from expansion() or numeric vectors of length 4. Passed to scale_*_continuous()

cluster

logical. Defaults to FALSE. If TRUE, the FastQC data will be clustered using hierarchical clustering.

dendrogram

logical. Ignored if cluster is FALSE. If both cluster and dendrogram are TRUE, the dendrogram will be displayed.

paletteName

Name of the palette for colouring the possible sources of the overrepresented sequences. Must be a palette name from RColorBrewer

Details

Percentages are obtained by simply summing those within a report. Any possible double counting by FastQC is ignored for the purposes of a simple approximation.

Plots generated from a FastqcData object show the top n sequences, grouped by their predicted source and coloured by whether the individual sequence would cause a WARN or FAIL.

Plots generated from a FastqcDataList group sequences by predicted source and summarise as a percentage of the total reads.
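
The summing described above can be illustrated with a minimal base R sketch. The data frame below is a hypothetical stand-in for the over-represented sequences table of a single report; its column names and values are assumptions made for illustration and are not output from the package.

```r
# Toy stand-in for the over-represented sequences of one report.
# Column names and values are illustrative assumptions, not package output.
overrep <- data.frame(
  Sequence = c("AGATCGGAAG", "GATCGGAAGA", "CTGTCTCTTA", "GGGGGGGGGG"),
  Percentage = c(0.50, 0.25, 0.75, 0.125),
  Possible_Source = c("Adapter A", "Adapter A", "Adapter B", "No Hit"),
  stringsAsFactors = FALSE
)

# Sum the percentages within each predicted source.
# Any double counting between sequences is deliberately ignored.
by_source <- aggregate(Percentage ~ Possible_Source, data = overrep, FUN = sum)

# Approximate total percentage of reads reported as over-represented
total_pct <- sum(overrep$Percentage)

print(by_source)
print(total_pct)
```

Because overlapping sequences are counted separately, the totals should be read as an approximate upper bound rather than an exact proportion of reads.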

Value

A standard ggplot2 object

Examples

# Get the files included with the package
packageDir <- system.file("extdata", package = "ngsReports")
fl <- list.files(packageDir, pattern = "fastqc.zip", full.names = TRUE)

# Load the FASTQC data as a FastqcDataList object
fdl <- FastqcDataList(fl)

# Another example which isn't ideal
plotOverrep(fdl)
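
The other arguments listed under Usage can be exercised in the same way. The sketch below is hedged: it assumes that a single FastqcData object can be extracted from the list with `fdl[[1]]`, and it is wrapped in a guard so that it does nothing when ngsReports is not installed. The arguments `n` and `paletteName` are taken from the signatures above; the values chosen are arbitrary.

```r
# Guarded so the sketch is a no-op when ngsReports is not installed
has_ngsReports <- requireNamespace("ngsReports", quietly = TRUE)

if (has_ngsReports) {
  library(ngsReports)

  # Same example files as above
  packageDir <- system.file("extdata", package = "ngsReports")
  fl <- list.files(packageDir, pattern = "fastqc.zip", full.names = TRUE)
  fdl <- FastqcDataList(fl)

  # FastqcData method: show at most 5 sequences from the first report
  # (extracting one report with [[1]] is an assumption of this sketch)
  p_single <- plotOverrep(fdl[[1]], n = 5)
  print(p_single)

  # FastqcDataList method: colour the possible sources with another
  # RColorBrewer palette
  p_all <- plotOverrep(fdl, paletteName = "Set2")
  print(p_all)
}
```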

ngsReports documentation built on Nov. 23, 2020, 2:01 a.m.