plotComplexity: Plot sequence complexity profile of a fastq file.

View source: R/plot-methods.R

plotComplexity (R Documentation)

Description

This function plots a histogram of the distribution of sequence complexities, expressed as the effective number of kmers as determined by seqComplexity. By default, kmers of size 2 are used, in which case a perfectly random sequence will approach an effective kmer number of 16 = 4 (nucleotides) ^ 2 (kmer size).
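To make the "effective number of kmers" concrete, the following base-R sketch computes it for a single sequence. It assumes the measure is the exponential of the Shannon entropy of the kmer frequencies, which may differ in detail from what seqComplexity does internally; the helper name effective_kmers is hypothetical and is not part of dada2.

```r
# Sketch only: effective number of kmers, ASSUMED here to be
# exp(Shannon entropy) of the observed kmer frequencies.
# effective_kmers() is a hypothetical helper, not a dada2 function.
effective_kmers <- function(sq, k = 2) {
  n <- nchar(sq)
  if (n < k) return(NA_real_)
  # Extract every overlapping kmer of width k
  starts <- seq_len(n - k + 1)
  kmers <- substring(sq, starts, starts + k - 1)
  # Relative frequency of each distinct kmer
  p <- as.numeric(table(kmers)) / length(kmers)
  exp(-sum(p * log(p)))
}

# A homopolymer contains a single distinct 2-mer ("AA")
low <- effective_kmers(strrep("A", 50))

# This 17-nt sequence contains each of the 16 possible 2-mers exactly once
high <- effective_kmers("AACAGATCCGCTGGTTA")
```

Under this assumption the homopolymer scores 1 and the second sequence reaches the maximum of 16, matching the upper bound described above for kmerSize = 2.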

Usage

plotComplexity(
  fl,
  kmerSize = 2,
  window = NULL,
  by = 5,
  n = 1e+05,
  bins = 100,
  aggregate = FALSE,
  ...
)

Arguments

fl

(Required). character. File path(s) to fastq or fastq.gz file(s).

kmerSize

(Optional). Default 2. The size of the kmers (or "oligonucleotides" or "words") to use.

window

(Optional). Default NULL. The width in nucleotides of the moving window. If NULL, the whole sequence is used.

by

(Optional). Default 5. The step size in nucleotides between each moving window tested.
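The interaction of window and by can be sketched in base R as follows. This is an illustration under stated assumptions, not the dada2 implementation: it assumes windows start at position 1 and advance by `by` nucleotides, that the minimum complexity over all windows is reported, and that complexity is the exponential of the Shannon entropy of kmer frequencies. The helpers effective_kmers and windowed_complexity are hypothetical names.

```r
# Hypothetical helper: exp(Shannon entropy) of kmer frequencies (assumed measure)
effective_kmers <- function(sq, k = 2) {
  n <- nchar(sq)
  if (n < k) return(NA_real_)
  starts <- seq_len(n - k + 1)
  kmers <- substring(sq, starts, starts + k - 1)
  p <- as.numeric(table(kmers)) / length(kmers)
  exp(-sum(p * log(p)))
}

# Hypothetical helper: ASSUMES the minimum over sliding windows is reported
windowed_complexity <- function(sq, k = 2, window = NULL, by = 5) {
  n <- nchar(sq)
  # With no window (or one at least as long as the read), use the whole sequence
  if (is.null(window) || window >= n) return(effective_kmers(sq, k))
  # Window start positions: 1, 1 + by, 1 + 2*by, ...
  starts <- seq(1, n - window + 1, by = by)
  scores <- vapply(
    starts,
    function(s) effective_kmers(substr(sq, s, s + window - 1), k),
    numeric(1)
  )
  min(scores)
}

# A diverse prefix followed by a 30-nt poly-A tail
sq <- paste0("AACAGATCCGCTGGTTA", strrep("A", 30))
whole <- windowed_complexity(sq)                          # whole-sequence score
local_min <- windowed_complexity(sq, window = 10, by = 5) # lowest window score
```

In this sketch the whole-sequence score stays above 1 because of the diverse prefix, while the windowed score drops to 1 because at least one 10-nt window falls entirely inside the poly-A tail. A smaller by tests more windows at higher cost.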

n

(Optional). Default 100,000. The number of records to sample from the fastq file.

bins

(Optional). Default 100. The number of bins to use for the histogram.

aggregate

(Optional). Default FALSE. If TRUE, compute an aggregate complexity profile for all fastq files provided.

...

(Optional). Arguments passed on to geom_histogram.

Value

A ggplot2 object. It is rendered to the default graphics device when printed, or it can be stored and modified further. See ggsave for options for writing the plot to a file.

See Also

seqComplexity, oligonucleotideFrequency

Examples

plotComplexity(system.file("extdata", "sam1F.fastq.gz", package="dada2"))
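A slightly longer sketch shows how the returned ggplot2 object might be stored, modified, and saved. It uses only the arguments listed above and assumes that the dada2 and ggplot2 packages are installed; the title text and the output file name are arbitrary choices for illustration.

```r
# Sketch: store, customise and save the complexity plot.
# Runs only if dada2 and ggplot2 are available; p stays NULL otherwise.
p <- NULL
if (requireNamespace("dada2", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  fl <- system.file("extdata", "sam1F.fastq.gz", package = "dada2")

  # Same example file as above, with a coarser histogram
  p <- dada2::plotComplexity(fl, kmerSize = 2, bins = 50)

  # The result is a ggplot2 object, so further layers can be added
  p <- p + ggplot2::ggtitle("Sequence complexity, kmerSize = 2")

  # Write to a temporary PDF (file name chosen for illustration only)
  out <- file.path(tempdir(), "complexity.pdf")
  ggplot2::ggsave(out, plot = p, width = 6, height = 4)
}
```

Passing several file paths in fl with aggregate = TRUE should give one combined profile instead of one per file, as described under Arguments.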


benjjneb/dada2 documentation built on Dec. 5, 2024, 4:02 p.m.