removeLinker: Remove linker sequences located at the start of short reads

Description

If linkers are attached during sample preparation, it may be useful to remove the linkers' sequences after sequencing. This method finds and removes linker sequences that are located at the start of the given reads.

Usage

## S4 method for signature 'XStringSet,DNAString,logical,numeric,numeric'
removeLinker(reads, linker, removeReadsWithoutLinker, minOverlap, penalty)

Arguments

reads

A DNAStringSet instance containing reads that may carry a linker at their start

linker

A DNAString instance with the linker's sequence

removeReadsWithoutLinker

Whether reads without linkers should be removed. Default is FALSE

minOverlap

The minimal score that must be achieved when aligning the linker. Default is round(length(linker)/2)

penalty

The penalty for substitutions or indels. Default is 2

Details

For each read, the best alignment of the linker within the start of the read (the first length(linker) + 5 bases) is computed. The following scoring scheme is used: each matching base scores +1, and each substitution or indel costs the value of the penalty argument (default: penalty=2). Gaps at the ends of the linker (overlap) are not penalized. An alignment is considered a match if its score is greater than or equal to minOverlap (default: minOverlap=round(length(linker)/2)). If the linker matches, the subsequence from position 1 to the end of the linker's alignment is removed from the read.
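The scoring scheme described above can be illustrated with a small base R sketch. This is not the package's implementation; the helper names overlap_score_sketch and trim_linker_sketch, the plain character-string inputs, and the linker sequence used in the usage lines are assumptions made for illustration only. The sketch fills a dynamic programming matrix in which gaps at either end of the alignment are free, applies the +1 / -penalty scoring, and trims the read when the score reaches the threshold.

```r
# Hypothetical helper (not part of R453Plus1Toolbox): scores an alignment of
# `linker` against the start of `read` in which end gaps are free.
overlap_score_sketch <- function(read, linker, penalty = 2) {
  # Only the first length(linker) + 5 bases of the read are considered
  window <- substr(read, 1, nchar(linker) + 5)
  r <- strsplit(window, "")[[1]]
  l <- strsplit(linker, "")[[1]]
  n <- length(r)
  m <- length(l)
  # H[i + 1, j + 1] holds the best score of an alignment ending at
  # linker base i and read base j. The first row and column stay 0,
  # so leading end gaps cost nothing.
  H <- matrix(0, nrow = m + 1, ncol = n + 1)
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      step <- if (l[i] == r[j]) 1 else -penalty
      via_diag <- H[i, j] + step              # match or substitution
      via_up <- H[i, j + 1] - penalty         # linker base without partner
      via_left <- H[i + 1, j] - penalty       # read base without partner
      H[i + 1, j + 1] <- max(via_diag, via_up, via_left)
    }
  }
  # Trailing end gaps are free as well: the alignment may end anywhere in
  # the last row (linker used up) or the last column (window used up).
  last_row <- H[m + 1, ]
  last_col <- H[, n + 1]
  if (max(last_row) >= max(last_col)) {
    list(score = max(last_row), end = which.max(last_row) - 1)
  } else {
    list(score = max(last_col), end = n)
  }
}

# Hypothetical helper: removes positions 1..end when the score is high enough
trim_linker_sketch <- function(read, linker,
                               minOverlap = round(nchar(linker) / 2),
                               penalty = 2) {
  hit <- overlap_score_sketch(read, linker, penalty)
  if (hit$score >= minOverlap) {
    substr(read, hit$end + 1, nchar(read))
  } else {
    read
  }
}

# Usage with an illustrative 20-base linker (an assumption, see above)
linker_seq <- "CTCGAGAATTCTGGATCCTC"
full_hit <- trim_linker_sketch("CTCGAGAATTCTGGATCCTCAAA", linker_seq)
partial_hit <- trim_linker_sketch("GAATTCTGGATCCTCAAA", linker_seq)
no_hit <- trim_linker_sketch("CTCGAGAAAAAAAAATCCTCAAA", linker_seq)
```

In this sketch the first read contains the whole linker (score 20) and the second read only its last 15 bases (score 15), so both reach the default threshold of 10 and are trimmed. The third read has too many substitutions in the middle of the linker and is returned unchanged.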

Value

removeLinker returns a DNAStringSet with trimmed reads.

Author(s)

Hans-Ulrich Klein

See Also

sequenceCaptureLinkers, DNAStringSet, pairwiseAlignment

Examples

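    library(R453Plus1Toolbox)

    # Use the first of the predefined sequence capture linkers
    linker <- sequenceCaptureLinkers()[[1]]
    # Read 2 is read 1 without its first five bases;
    # read 3 differs from read 1 in the middle
    reads <- DNAStringSet(c(
        "CTCGAGAATTCTGGATCCTCAAA",
             "GAATTCTGGATCCTCAAA",
        "CTCGAGAAAAAAAAATCCTCAAA"))
    removeLinker(reads, linker)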

R453Plus1Toolbox documentation built on Nov. 8, 2020, 5:59 p.m.