filterReads: Filter out reads without adaptors


Description

Filter reads to only retain those with the essential adaptors.

Usage

filterReads(aligned, score1, score2, essential1=TRUE, essential2=TRUE)

Arguments

aligned

A list of adaptor alignment information and read sequences, identical to the output of adaptorAlign.

score1, score2

Numeric scalars specifying the minimum alignment score required for positive match of adaptor 1 or 2 to the read sequence.

essential1, essential2

Logical scalars indicating whether the presence of adaptor 1 or 2 is essential, i.e., the read should be discarded if this adaptor is not found.

Details

Identification of a matching adaptor sequence is based on the alignment scores exceeding a certain threshold, i.e., score1 and score2 for adaptors 1 and 2 respectively. An appropriate threshold for each adaptor can be chosen with methods like getAdaptorThresholds.

If essential1=TRUE, a read will be discarded if a positive match to adaptor 1 cannot be found. This is useful when the adaptor contains critical information such as the unique molecular identifier. However, if adaptor 1 is not important (e.g., only necessary for PCR amplification), then it does not matter that it cannot be identified. In such cases, setting essential1=FALSE will ensure that reads without a match to adaptor 1 are not discarded. The same logic applies to essential2 for adaptor 2.
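The keep-or-discard rule above can be illustrated with a minimal base R sketch. This is not the sarlacc implementation: the helper keepReads and the toy score vectors are hypothetical, and treating the threshold as inclusive (score >= threshold) is an assumption.

```r
# Toy alignment scores for four reads (made-up values, not sarlacc output).
score.a1 <- c(10, 2, 8, 1)
score.a2 <- c(9, 7, 3, 2)

# Hypothetical helper mirroring the rule described above:
# a read is kept if every essential adaptor has a positive match.
keepReads <- function(s1, s2, score1, score2, essential1=TRUE, essential2=TRUE) {
    found1 <- s1 >= score1  # assumption: threshold is inclusive
    found2 <- s2 >= score2
    (found1 | !essential1) & (found2 | !essential2)
}

# Both adaptors essential: only reads matching both are kept.
keep.both <- keepReads(score.a1, score.a2, score1=5, score2=5)

# Adaptor 1 not essential: only the match to adaptor 2 matters.
keep.only2 <- keepReads(score.a1, score.a2, score1=5, score2=5, essential1=FALSE)
```

With these toy values, the second read lacks a match to adaptor 1, so it is dropped in the first call and retained in the second.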

This function will also report the start and end of the read sequence between the adaptors, marking the interval that would remain after adaptor trimming. These positions are reported relative to the canonical orientation (see ?adaptorAlign for more details). The interval may include the start or end of the read if adaptor 1 or 2, respectively, is not essential and not found. Reads will also be discarded if adaptor 1 ends after adaptor 2 begins, as this implies that the adaptors overlap (and that there is no sequence in between).
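The interval logic above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: the helper trimInterval, its argument names, the 1-based inclusive coordinates and the overlap test (end1 >= start2) are all hypothetical.

```r
# Hypothetical helper: compute the interval left after trimming, given the
# end of adaptor 1, the start of adaptor 2 and whether each was found.
trimInterval <- function(end1, start2, found1, found2, read.len) {
    # If adaptor 1 was not found, the interval begins at the read start.
    trim.start <- ifelse(found1, end1 + 1L, 1L)
    # If adaptor 2 was not found, the interval runs to the read end.
    trim.end <- ifelse(found2, start2 - 1L, read.len)
    # Assumption: adaptors overlap when adaptor 1 ends at or after the start of adaptor 2.
    overlap <- found1 & found2 & end1 >= start2
    data.frame(trim.start=trim.start, trim.end=trim.end, overlap=overlap)
}

# Three toy reads of length 100: both adaptors found, adaptor 1 missing,
# and overlapping adaptors.
res <- trimInterval(end1=c(20L, 20L, 60L), start2=c(81L, 81L, 50L),
    found1=c(TRUE, FALSE, TRUE), found2=c(TRUE, TRUE, TRUE),
    read.len=c(100L, 100L, 100L))
```

In this sketch, reads flagged with overlap=TRUE correspond to those that would be discarded for having no sequence between the adaptors.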

Value

aligned is returned, subsetted to only retain rows (i.e., reads) that have all essential adaptors. It also contains the additional trim.start and trim.end fields, indicating the start and end, respectively, of the sequence between the two adaptors.

Author(s)

Florian Bieberich, Aaron Lun

See Also

adaptorAlign for the input into this function.

Examples

example(adaptorAlign)

(filt <- filterReads(out, score1=5, score2=5)) # both identified

filterReads(out, score1=100, score2=5) # adaptor 1 not identified

filterReads(out, score1=100, score2=5, essential1=FALSE) # ... but that's okay.

MarioniLab/sarlacc documentation built on May 13, 2019, 12:51 p.m.