This function generates a 2-D color/contour map of the average quality score at each combination of forward and reverse read cycle for a designated percentile. It is intended to help the user decide where trimming should be performed.
qualcontour(f_path, r_path, idx, percentile = 0.25, amp_length, min_overlap,
  n_samples = 12, q = c(25, 30, 35), bins = 50, nc = 1,
  seed = sample.int(.Machine$integer.max, 1), verbose = FALSE)
f_path | (required) A character vector locating the forward read (Read 1) .fastq files
r_path | (required) A character vector locating the reverse read (Read 2) .fastq files
idx | Indexes (within f_path and r_path) identifying specific .fastq files to be used for analysis
percentile | The percentile to be targeted. Defaults to 0.25 (i.e. the first quartile).
amp_length | Intra-primer amplicon length. Calculated distance in base-pairs between primers. Used to determine region of no overlap. Both 'amp_length' and 'min_overlap' must be provided for these calculations.
min_overlap | The minimum amount of overlap between the two reads. Used to determine region of no overlap. Both 'amp_length' and 'min_overlap' must be provided for these calculations.
n_samples | Integer indicating the number of samples to include in the visualization. Defaults to 12.
q | A numeric vector designating Phred quality scores to be represented on the plot. Defaults to 25, 30, and 35.
bins | Integer designating the number of bins each read should be separated into. For example, visualizing a 250 bp read with 50 bins would imply that each bin represents 5 cycles/bp. Increasing the number of bins improves granularity at the cost of memory and processing speed. Defaults to 50.
nc | The number of cores to use when multithreading. Defaults to 1.
seed | An integer value to be used when randomly selecting the subset of samples to be visualized.
verbose | If set to TRUE, provides verbose output. Defaults to FALSE.
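The 'bins' arithmetic described above can be checked with a few lines of base R. This is only a sketch that assumes equal-width bins; the binning used inside qualcontour may differ, and 'read_length' is a hypothetical value taken from the 250 bp example.

```r
# Hypothetical read length; 'bins' matches the documented default.
read_length <- 250
bins <- 50

# Assign each cycle (1..250) to one of 50 equal-width bins
# (assumption: equal-width binning, not necessarily what qualcontour does).
cycle_bin <- ceiling(seq_len(read_length) * bins / read_length)

# Number of cycles that fall into each bin.
cycles_per_bin <- table(cycle_bin)
unique(as.integer(cycles_per_bin))
```

With these values every bin covers 5 cycles, which matches the example given for 'bins'.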
qualcontour's (quality contour) two required arguments are character
vectors of the file paths for forward ('f_path') and reverse ('r_path')
reads. qualcontour tabulates the distribution of quality scores at each
read cycle for the forward and reverse reads independently and then
averages (arithmetic mean) the quality scores for each (forward/reverse)
cycle combination. These values are then plotted as a ggplot2 object. Users
can (re)run 'qualcontour' with different 'percentile' values to visualize
how the quality scores vary in shape.
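The calculation described above can be illustrated on simulated data. The sketch below is not the package's implementation: the quality matrices, the helper 'sim_quals', and all numeric values are made up for illustration. It only shows the idea of taking a per-cycle percentile for each read direction and then averaging every forward/reverse cycle combination.

```r
set.seed(1)  # simulated data only

n_reads  <- 200
n_cycles <- 10

# Simulated Phred scores (rows = reads, columns = cycles);
# mean quality declines with cycle number. 'sim_quals' is a hypothetical helper.
sim_quals <- function(start, drop) {
  means <- start - drop * (seq_len(n_cycles) - 1)
  sapply(means, function(m) pmin(40, pmax(2, round(rnorm(n_reads, m, 4)))))
}
fwd <- sim_quals(start = 37, drop = 1.0)
rvs <- sim_quals(start = 36, drop = 1.5)

# Quality score at the chosen percentile for every cycle, per read direction.
percentile <- 0.25
fwd_q <- apply(fwd, 2, quantile, probs = percentile, names = FALSE)
rvs_q <- apply(rvs, 2, quantile, probs = percentile, names = FALSE)

# Arithmetic mean for every (forward cycle, reverse cycle) combination.
avg_q <- outer(fwd_q, rvs_q, function(f, r) (f + r) / 2)
dim(avg_q)
```

Rerunning the last three statements with a different 'percentile' gives a different surface, which mirrors rerunning 'qualcontour' with different 'percentile' values.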
plotQualityProfile in the 'dada2' package provides an elegant way of
looking at the quality profiles for the forward or reverse reads.
A ggplot object with the following attributes:
Samples used to generate the plot.
Value for amp_length used to generate the plot.
Value for min_overlap used to generate the plot.
Seed used to select the samples used to generate the plot.
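## Not run:
library(theseus)
library(ggplot2)
# Locate the example .fastq.gz files shipped with the package
fns <- sort(list.files(system.file('testdata', package='theseus'),
                       full.names=TRUE))
# Split the files into forward (R1) and reverse (R2) reads
f_path <- fns[grepl('R1.fastq.gz', fns, fixed=TRUE)]
r_path <- fns[grepl('R2.fastq.gz', fns, fixed=TRUE)]
# Quality contour at the first quartile, using 2 samples
p.qc <- qualcontour(f_path, r_path, n_samples=2, verbose=TRUE,
                    percentile=.25, nc=1)
p.qc
# Overlay reference lines, e.g. to mark candidate trim positions
p.qc + geom_hline(yintercept=175) + geom_vline(xintercept=275)
## End(Not run)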
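When choosing trim positions such as the 275 and 175 marked in the example, it can help to check how much overlap they would leave between the reads. The sketch below is an assumption about the geometry, not code from the package: it supposes that each read starts at one end of the amplicon, so the overlap is the sum of the two retained lengths minus the amplicon length. The values of 'amp_length' and 'min_overlap' and the helper 'overlap_after_trim' are hypothetical.

```r
# Hypothetical values for illustration only.
amp_length  <- 420   # intra-primer amplicon length (bp)
min_overlap <- 20    # minimum overlap required between the two reads (bp)

# Overlap left when one read is kept up to cycle 'trim_a' and the other up to
# cycle 'trim_b' (assumes each read starts at an opposite end of the amplicon).
overlap_after_trim <- function(trim_a, trim_b, amp_length) {
  trim_a + trim_b - amp_length
}

ov <- overlap_after_trim(trim_a = 275, trim_b = 175, amp_length = amp_length)
enough <- ov >= min_overlap
c(overlap = ov, enough = enough)
```

Under these assumed values the two trim positions leave 30 bp of overlap, which is above the assumed minimum of 20 bp.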