pairwiseAlignSeqs: Align a short pattern to variable length target sequences.


View source: R/hiReadsProcessor.R

Description

Align a fixed-length short pattern sequence (e.g. primers or adaptors) to subject sequences using pairwiseAlignment. By default, this function aligns patternSeq against subjectSeqs with type="overlap", gapOpening=-1, and gapExtension=-1. These parameters can be adjusted if preferred, but doing so is not recommended. The function is meant for aligning a short pattern onto a large collection of subjects. To align a vector sequence to subjects, use BLAT, or see blatSeqs or findAndRemoveVector.

Usage

pairwiseAlignSeqs(
  subjectSeqs = NULL,
  patternSeq = NULL,
  side = "left",
  qualityThreshold = 1,
  showStats = FALSE,
  bufferBases = 5,
  doRC = TRUE,
  returnUnmatched = FALSE,
  returnLowScored = FALSE,
  parallel = FALSE,
  ...
)

Arguments

subjectSeqs

DNAStringSet object containing sequences to be searched for the pattern. Subjects are generally longer than patternSeq; any subject shorter than patternSeq is ignored in the alignment.

patternSeq

DNAString object or a sequence containing the query sequence to search. This is generally smaller than subjectSeqs.

side

side of the sequence on which to perform the search: left, right or middle. Default is 'left'.

qualityThreshold

fraction of patternSeq that must match, from 0 to 1. Default is 1 (full match).

showStats

toggle output of search statistics. Default is FALSE.

bufferBases

use x number of bases in addition to patternSeq length to perform the search. Beneficial in cases where the pattern has homopolymers or indels compared to the subject. Default is 5. Doesn't apply when side='middle'.

doRC

perform reverse complement search of the defined pattern. Default is TRUE.

returnUnmatched

return sequences that had no match, or less than a 5% match, to patternSeq. Default is FALSE.

returnLowScored

return sequences that had a quality score below the defined qualityThreshold. Default is FALSE.

parallel

use a parallel backend to perform the calculation with BiocParallel. Default is FALSE. If no parallel backend is registered, then a serial version is run using SerialParam.

...

extra parameters for pairwiseAlignment
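
The side, bufferBases and doRC arguments can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the package's implementation: search_window() and rev_comp() are hypothetical helpers that do not exist in hiReadsProcessor, and the way the real function trims subjects or chooses between the pattern and its reverse complement may differ.

```r
# Hypothetical helper (not part of hiReadsProcessor): extract the part of each
# subject that would be searched, assuming the window is the pattern length
# plus bufferBases, taken from the chosen side.
search_window <- function(subject, pattern,
                          side = c("left", "right", "middle"),
                          bufferBases = 5) {
  side <- match.arg(side)
  n <- nchar(subject)
  width <- nchar(pattern) + bufferBases
  switch(side,
    left   = substr(subject, 1, pmin(width, n)),
    right  = substr(subject, pmax(1, n - width + 1), n),
    middle = subject  # bufferBases does not apply to side = "middle"
  )
}

# Hypothetical helper: reverse complement of plain A/C/G/T strings.
rev_comp <- function(x) {
  comp <- chartr("ACGT", "TGCA", x)
  vapply(strsplit(comp, ""),
         function(bases) paste(rev(bases), collapse = ""),
         character(1))
}

subject <- "AAAAAAAAAACCTGAATCCTGGCAATGTCATCATC"
pattern <- "AAAAAAAAAA"

left_window  <- search_window(subject, pattern, side = "left")
right_window <- search_window(subject, pattern, side = "right")
rc_pattern   <- rev_comp("AAATAATCCG")
```

With the default bufferBases of 5 and a 10-base pattern, each window in this sketch is 15 bases wide, which leaves room for an indel or homopolymer run near the pattern.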

Value

Note

See Also

primerIDAlignSeqs, vpairwiseAlignSeqs, doRCtest, findAndTrimSeq, blatSeqs, findAndRemoveVector

Examples

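library(Biostrings)       # DNAStringSet(), xscat()
library(hiReadsProcessor) # pairwiseAlignSeqs()
# Subject sequences to search
subjectSeqs <- c("CCTGAATCCTGGCAATGTCATCATC", "ATCCTGGCAATGTCATCATCAATGG",
                 "ATCAGTTGTCAACGGCTAATACGCG", "ATCAATGGCGATTGCCGCGTCTGCA",
                 "CCGCGTCTGCAATGTGAGGGCCTAA", "GAAGGATGCCAGTTGAAGTTCACAC")
# Prepend a 10-base poly-A pattern to every subject
subjectSeqs <- DNAStringSet(xscat("AAAAAAAAAA", subjectSeqs))
# Search for the exact pattern (default qualityThreshold = 1)
pairwiseAlignSeqs(subjectSeqs, "AAAAAAAAAA", showStats = TRUE)
# Search for a pattern with two mismatches, relaxing the threshold to 0.5
pairwiseAlignSeqs(subjectSeqs, "AAATAATAAA", showStats = TRUE,
                  qualityThreshold = 0.5)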

malnirav/hiReadsProcessor documentation built on July 29, 2021, 6:33 a.m.