Perform quality assessment on short reads

Description

This function is a common interface to quality assessment functions available in ShortRead. Results from this function may be displayed in brief, or integrated into reports using, e.g., report.

Usage

qa(dirPath, ...)
## S4 method for signature 'character'
qa(dirPath, pattern=character(0), 
    type=c("fastq", "SolexaExport", "SolexaRealign", "Bowtie",
           "MAQMap", "MAQMapShort"),
    ...)
## S4 method for signature 'list'
qa(dirPath, ...)

Arguments

dirPath

A character vector or other object (e.g., SolexaPath; see showMethods, below) locating the data for which quality assessment is to be performed. To list the defined methods, evaluate the showMethods call in the example code below; see the help pages of those methods for details.

pattern

A character vector limiting the files in dirPath to be processed, as with list.files. Care should be taken to specify pattern to avoid reading unintended files.

type

The type of file being parsed; must be a character vector of length 1, selected from the values enumerated for type in the Usage section.

...

Additional arguments used by methods.

sample=TRUE:

Logical(1) indicating whether QA should be performed on a sample (default size 1000000) drawn from each FASTQ file, or on the entire file.

n:

The number of reads to sample when processing FASTQ files.

Lpattern, Rpattern:

A character vector or XString object to be matched to the left (Lpattern) or right (Rpattern) end of a sequence. If either Lpattern or Rpattern is provided, trimLRPatterns is invoked to produce a measure of adapter contamination. Mismatch rates are 0.1 on the left and 0.2 on the right, with a minimum overlap of 10 nt.

BPPARAM:

How parallel evaluation will be performed. See BiocParallelParam; the default is BiocParallel::registered()[1].
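
Because pattern is interpreted as with list.files, one way to check which files a call would pick up is to run list.files with the same pattern first. The following is a minimal sketch using base R only; the temporary directory and the file names are hypothetical and exist only for illustration.

```r
## hypothetical directory holding two FASTQ files and one unrelated file
demoDir <- file.path(tempdir(), "qa-pattern-demo")
dir.create(demoDir, showWarnings=FALSE)
invisible(file.create(file.path(demoDir,
    c("s1.fastq.gz", "s2.fastq.gz", "notes.txt"))))

## pattern is a regular expression; anchoring it avoids accidental matches
matched <- list.files(demoDir, pattern="\\.fastq\\.gz$")
matched
```

An unanchored pattern such as "fastq.gz" also works, but it treats each "." as a wildcard and matches anywhere in the file name, so it can select more files than intended.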

Details

The most common use of this function provides a directory path and pattern identifying FASTQ files for quality assessment. The default is then to create a quality assessment report based on a random sample of n=1000000 reads from each file.
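
As a sketch of this common use with a smaller sample, the call below passes n explicitly. It assumes the ShortRead package is installed and uses the example data shipped with it; the value n=5000 and the variable name qaSmall are arbitrary choices for illustration.

```r
library(ShortRead)

dirPath <- system.file(package="ShortRead", "extdata", "E-MTAB-1147")

## sample 5000 reads per file instead of the default 1000000
qaSmall <- qa(dirPath, "fastq.gz", n=5000, BPPARAM=SerialParam())

## per-file read counts stored in the result
qaSmall[["readCounts"]]
```

Setting sample=FALSE instead would process each file in full, which can be slow and memory-intensive for large FASTQ files.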

The following methods are defined, in addition to those on S4 formal classes documented elsewhere:

qa,character-method

Quality assessment is performed on all files in directory dirPath whose file name matches pattern. The type of analysis performed is based on the type argument. Use SolexaExport when all files matching pattern are Solexa _export.txt files. Use SolexaRealign for Solexa _realign.txt files. Use Bowtie for Bowtie files. Use MAQMapShort for MAQ map files produced by MAQ versions below 0.70 and MAQMap for more recent output. Use fastq for collections of fastq-format files. Quality assessment details vary depending on data source.

qa,list-method

dirPath is a list of objects, all of the same class and typically derived from ShortReadQ, on which quality assessment is performed. All elements of the list must have names, and these should be unique.
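
A minimal sketch of the list method, assuming the ShortRead package is installed: the example FASTQ files are read into ShortReadQ objects with readFastq, and the list is named by file so that all names are present and unique. The variable names fls, reads, and qaList are illustrative only.

```r
library(ShortRead)

dirPath <- system.file(package="ShortRead", "extdata", "E-MTAB-1147")
fls <- list.files(dirPath, "fastq.gz", full.names=TRUE)

## one ShortReadQ object per file; every element needs a unique name
reads <- lapply(fls, readFastq)
names(reads) <- basename(fls)

qaList <- qa(reads)
qaList[["readCounts"]]
```

This route reads every file fully into memory, so it is best suited to data that is already loaded or small enough to hold at once.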

Value

An object derived from class .QA. Values contained in this object are meant for use by report.

Author(s)

Martin Morgan <mtmorgan@fhcrc.org>

See Also

.QA, SolexaExportQA, MAQMapQA, FastqQA

Examples

dirPath <- system.file(package="ShortRead", "extdata", "E-MTAB-1147")
## sample 1M reads / file
qa <- qa(dirPath, "fastq.gz", BPPARAM=SerialParam())
if (interactive())
    browseURL(report(qa))

showMethods("qa", where=getNamespace("ShortRead"))