extractQuality: Obtain read qualities from a Fastq file or ShortReadQ object

Description

Converts the read qualities encoded in a fastq-formatted file or a ShortReadQ object into error probabilities.

Usage

extractQuality(reads, minLength = 25, dir, 
	type = c("Illumina", "Sanger", "Solexa"))

Arguments

reads

Either the name of a fastq file or a ShortReadQ object (see Details).

minLength

Minimum read length required. Reads shorter than this are not included in the result.

dir

Directory containing the fastq file. Ignored if reads is a ShortReadQ object.

type

Character string indicating the format the qualities are encoded in (see Details).

Details

If reads and dir are character strings, it is assumed that ‘dir/reads’ is the name of a fastq file. Otherwise, reads should be a ShortReadQ object, in which case dir is ignored.

Currently, three different encodings of read qualities are supported. The encoding has to be selected via the type argument. The supported formats are:

Illumina

The format currently used by Illumina (version 1.3). This is a phred score between 0 and 40 encoded as ASCII characters 64 to 104. [default]

Sanger

The Sanger format uses a phred quality score between 0 and 93 encoded as ASCII characters 33 to 126.

Solexa

The older format previously used by Solexa/Illumina. This is a quality score between -5 and 40 encoded as ASCII characters 59 to 104.
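
The three encodings above can be illustrated with a minimal base R sketch. The helper qualityToErrorProb is hypothetical and is not part of ChIPsim; it is not necessarily how extractQuality or decodeQuality are implemented. It assumes the standard conversions: a phred score Q corresponds to an error probability of 10^(-Q/10), and a Solexa score Q corresponds to 1/(1 + 10^(Q/10)).

```r
## Hypothetical helper (not part of ChIPsim): decode one quality string
## into error probabilities using the ASCII offsets listed above.
qualityToErrorProb <- function(qual, type = c("Illumina", "Sanger", "Solexa")) {
	type <- match.arg(type)
	## Sanger uses an ASCII offset of 33; Illumina 1.3 and Solexa use 64
	offset <- if (type == "Sanger") 33 else 64
	score <- utf8ToInt(qual) - offset
	if (type == "Solexa") {
		## Solexa scores are based on the odds p / (1 - p)
		1 / (1 + 10^(score / 10))
	} else {
		## phred scores are based on the error probability p itself
		10^(-score / 10)
	}
}

## 'I', '5' and '+' are ASCII 73, 53 and 43, i.e. phred 40, 20 and 10
sangerProb <- qualityToErrorProb("I5+", type = "Sanger")
```

Decoding the same string with the wrong type shifts every score by the difference in offsets, so it is worth confirming the encoding of a file before choosing type.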

Value

A list containing one vector of error probabilities for each read in reads that is at least minLength nucleotides long.
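
As a rough sketch of how a result of this shape can be summarised, the following uses a small hand-made list in place of real output. The values and the variable names are made up for illustration; only base R functions are used.

```r
## Made-up stand-in for the return value: one numeric vector per read
qualities <- list(
	c(0.001, 0.01, 0.1),
	c(0.0001, 0.001, 0.01, 0.1)
)

## length of each retained read
readLengths <- sapply(qualities, length)

## expected number of sequencing errors per read
## (sum of the per-base error probabilities)
expectedErrors <- sapply(qualities, sum)
```

Because reads may differ in length, the result is a list rather than a matrix; subsetting each vector to a common length first, as in the Examples, yields a matrix.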

Author(s)

Peter Humburg

See Also

decodeQuality, readQualitySample

Examples

## Not run: 
	## load reads from a fastq file with Sanger encoding
	qualities <- extractQuality("test.fastq", dir=".", type="Sanger")
	
	## extract error probabilities for first 25bp of each read
	qualities25 <- sapply(qualities, "[", 1:25)
	
	## plot average error probability for each position
	plot(rowMeans(qualities25), type='b', xlab="Read position", 
		ylab="Error probability")

## End(Not run)  

ChIPsim documentation built on Nov. 8, 2020, 8:09 p.m.