qualcontour: Create read quality color contour plots
In theseus: Analysis and Visualization Tools for Microbial Community Data


This function generates a 2-D color/contour map representing the average quality scores by location (read cycle number) for a designated percentile. It is intended to assist the user with deciding where trimming should be performed.


qualcontour(f_path, r_path, idx, percentile = 0.25, amp_length, min_overlap,
  n_samples = 12, q = c(25, 30, 35), bins = 50, nc = 1,
  seed = sample.int(.Machine$integer.max, 1), verbose = FALSE)

`f_path`	(required) A character vector locating the forward read (Read 1) .fastq files
`r_path`	(required) A character vector locating the reverse read (Read 2) .fastq files
`idx`	Indexes (within f_path and r_path) identifying specific .fastq files to be used for analysis
`percentile`	The percentile to be targeted. Defaults to 0.25 (i.e. the first quartile).
`amp_length`	Intra-primer amplicon length. Calculated distance in base-pairs between primers. Used to determine region of no overlap. Both 'amp_length' and 'min_overlap' must be provided for these calculations.
`min_overlap`	The minimum amount of overlap between the two reads. Used to determine region of no overlap. Both 'amp_length' and 'min_overlap' must be provided for these calculations.
`n_samples`	Integer indicating the number of samples to include in the visualization. Defaults to 12.
`q`	A numeric vector designating Phred quality scores to be represented on the plot. Defaults to 25, 30, and 35.
`bins`	Integer designating the number of bins each read should be separated into. For example, visualizing a 250 bp read with 50 bins would imply that each bin represents 5 cycles/bp. Increasing the number of bins improves granularity at the cost of memory and processing speed. Defaults to 50.
`nc`	The number of cores to use when multithreading. Defaults to 1.
`seed`	An integer value to be used when randomly selecting the subset of samples to be visualized.
`verbose`	If set to TRUE, provides verbose output. Defaults to FALSE.
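The arithmetic behind 'bins', 'amp_length' and 'min_overlap' can be sketched in a few lines of base R. This is an illustration only: the helper names `cycles_per_bin` and `has_min_overlap` are hypothetical and not part of theseus, the amplicon length of 420 bp is a made-up value, and the overlap formula (forward truncation plus reverse truncation minus amplicon length) is an assumption about how the region of no overlap is defined.

```r
# Hypothetical helpers for illustration; not part of theseus.

# Number of read cycles represented by each bin.
cycles_per_bin <- function(read_length, bins = 50) {
  read_length / bins
}

# TRUE when reads truncated at the given cycles still overlap by at
# least `min_overlap` bases across the amplicon.
# Assumption: overlap = f_trunc + r_trunc - amp_length.
has_min_overlap <- function(f_trunc, r_trunc, amp_length, min_overlap) {
  (f_trunc + r_trunc - amp_length) >= min_overlap
}

# A 250 bp read split into 50 bins gives 5 cycles per bin.
cycles_per_bin(250, bins = 50)

# Illustrative values only: a 420 bp amplicon with 20 bp minimum overlap.
has_min_overlap(275, 175, amp_length = 420, min_overlap = 20)  # overlap of 30
has_min_overlap(200, 175, amp_length = 420, min_overlap = 20)  # overlap of -45
```

Under this assumption, any pair of truncation points whose sum falls below 'amp_length' + 'min_overlap' would lie in the region of no overlap.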

The two required arguments of qualcontour (quality contour) are character vectors of file paths for the forward ('f_path') and reverse ('r_path') reads. qualcontour tabulates the distribution of quality scores at each read cycle for the forward and reverse reads independently, then takes the arithmetic mean of the quality scores for each combination of forward and reverse cycle. These values are then plotted as a ggplot2 object. Users can (re)run 'qualcontour' with different 'percentile' values to see how the shape of the quality contours changes. plotQualityProfile in the 'dada2' package provides an elegant way of looking at the quality profiles of the forward or reverse reads separately.
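The calculation described above can be sketched with simulated data. This is a minimal base R illustration under stated assumptions, not the package's implementation: the quality matrices are randomly generated, `sim_quals` is a hypothetical helper, and binning is omitted.

```r
set.seed(1)

# Hypothetical helper: simulated Phred scores, rows are reads and
# columns are cycles. Mean quality declines along the read.
sim_quals <- function(n_reads, n_cycles, start = 38, drop = 15) {
  mu <- seq(start, start - drop, length.out = n_cycles)
  q <- matrix(rnorm(n_reads * n_cycles, mean = rep(mu, each = n_reads), sd = 3),
              nrow = n_reads)
  pmax(round(q), 2)
}

fwd <- sim_quals(200, 250)
rvs <- sim_quals(200, 250, drop = 22)

# Per-cycle quality at the chosen percentile (0.25 = first quartile),
# computed for forward and reverse reads independently.
percentile <- 0.25
fwd_q <- apply(fwd, 2, quantile, probs = percentile)
rvs_q <- apply(rvs, 2, quantile, probs = percentile)

# Arithmetic mean for every (forward cycle, reverse cycle) combination.
avg_q <- outer(fwd_q, rvs_q, function(f, r) (f + r) / 2)

dim(avg_q)  # one row per forward cycle, one column per reverse cycle
```

A matrix like `avg_q` is the kind of surface a contour plot would draw, with forward cycles on one axis and reverse cycles on the other.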

A ggplot object with the following attributes:

idx: Samples used to generate the plot.
amp_length: Value for amp_length used to generate the plot.
min_overlap: Value for min_overlap used to generate the plot.
seed: Seed used to select the samples used to generate the plot.
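If these values are stored as ordinary R attributes (an assumption; the page does not say how they are attached), they could be read back with `attr()`, for example to repeat a run on the same samples. The sketch below uses a stand-in object with made-up values rather than a real qualcontour result.

```r
# Stand-in for a qualcontour() result; all values are invented.
p_mock <- structure(list(), class = "mock_plot",
                    idx = c(3L, 7L), amp_length = 420,
                    min_overlap = 20, seed = 12345L)

# Retrieve the recorded settings.
used_idx  <- attr(p_mock, "idx")
used_seed <- attr(p_mock, "seed")

used_idx
used_seed
```

Passing the recovered 'seed' (or 'idx') back to qualcontour should then select the same subset of samples.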

qa, plotQualityProfile

## Not run: 
library(theseus)
library(ggplot2)
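# Locate the example .fastq files bundled with the package
fns <- sort(list.files(system.file('testdata', package='theseus'),
                       full.names=TRUE))
# Split into forward (R1) and reverse (R2) read files
f_path <- fns[grepl('R1.fastq.gz', fns, fixed=TRUE)]
r_path <- fns[grepl('R2.fastq.gz', fns, fixed=TRUE)]
# Build the contour plot from two samples at the first quartile
p.qc <- qualcontour(f_path, r_path, n_samples=2, verbose=TRUE,
                    percentile=.25, nc=1)
p.qc
# Overlay reference lines at candidate trimming positions
p.qc + geom_hline(yintercept=175) + geom_vline(xintercept=275)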

## End(Not run)