primerIDAlignSeqs: Align a short pattern with PrimerID to variable length target...

View source: R/hiReadsProcessor.R

Description

Align a fixed-length short pattern sequence containing a primerID to variable-length subject sequences using pairwiseAlignment. By default this function uses type="overlap", gapOpening=-1, and gapExtension=-1 to align patternSeq against subjectSeqs. The pattern is split into one more piece than there are primerIDs, and each piece is searched against subjectSeqs separately. For example, patternSeq="AGCATCAGCANNNNNNNNNACGATCTACGCC" will launch two search jobs, one for each side of the Ns. For each search, qualityThreshold is used to filter out candidate alignments, and the region between the two matches is taken to be the primerID. This strategy is beneficial because homopolymer errors introduce indels, so the length of the recovered primerID(s) will most likely not be the same as you expected!
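The splitting step described above can be illustrated with a few lines of base R. This is only a sketch of the idea, not the package's actual implementation: the variable names `nRuns`, `primerIDLengths`, and `flanks` are hypothetical, and the real function works on Biostrings objects rather than plain character strings.

```r
# Illustrative sketch only: split a primerID pattern into its flanking parts.
# Not taken from hiReadsProcessor; all variable names here are hypothetical.
patternSeq <- "AGCATCAGCANNNNNNNNNACGATCTACGCC"

# Locate each run of Ns; every run stands for one primerID.
nRuns <- gregexpr("N+", patternSeq)[[1]]
primerIDLengths <- attr(nRuns, "match.length")

# The flanks are what remains when the N runs are treated as separators.
# One primerID gives two flanks, i.e. two separate searches.
flanks <- strsplit(patternSeq, "N+")[[1]]

print(flanks)
print(primerIDLengths)
```

Each flank would then be aligned on its own, and whatever lies between the two hits in a subject sequence is reported as the primerID, which is why its length can differ from the number of Ns in the pattern.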

Usage

primerIDAlignSeqs(
  subjectSeqs = NULL,
  patternSeq = NULL,
  qualityThreshold1 = 0.75,
  qualityThreshold2 = 0.5,
  doAnchored = FALSE,
  doRC = TRUE,
  returnUnmatched = FALSE,
  returnRejected = FALSE,
  showStats = FALSE,
  ...
)

Arguments

subjectSeqs

DNAStringSet object containing sequences to be searched for the pattern.

patternSeq

DNAString object or a sequence containing the query sequence to search with the primerID.

qualityThreshold1

fraction (between 0 and 1) of the first part of patternSeq that must match. Default is 0.75.

qualityThreshold2

fraction (between 0 and 1) of the second part of patternSeq that must match. Default is 0.50.

doAnchored

for a primerID-based patternSeq, use the bases immediately before and after the primerID in patternSeq as anchors? Default is FALSE.

doRC

perform reverse complement search of the defined pattern. Default is TRUE.

returnUnmatched

return sequences that had no match, or less than a 5% match, to the first part of patternSeq before the primerID. Default is FALSE.

returnRejected

return sequences that match only one side of patternSeq, or whose primerID length is not within +/-2 of the number of Ns in the pattern. Default is FALSE.

showStats

toggle output of search statistics. Default is FALSE.

...

extra parameters for pairwiseAlignment

Value

Note

See Also

vpairwiseAlignSeqs, pairwiseAlignSeqs, doRCtest, blatSeqs, findAndRemoveVector

Examples

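library(Biostrings)        # provides xscat() and DNAStringSet()
library(hiReadsProcessor)  # provides primerIDAlignSeqs()

# Example reads and the primerIDs that will be embedded in them.
subjectSeqs <- c("CCTGAATCCTGGCAATGTCATCATC", "ATCCTGGCAATGTCATCATCAATGG", 
"ATCAGTTGTCAACGGCTAATACGCG", "ATCAATGGCGATTGCCGCGTCTGCA", 
"CCGCGTCTGCAATGTGAGGGCCTAA", "GAAGGATGCCAGTTGAAGTTCACAC")
ids <- c("GGTTCTACGT", "AGGAGTATGA", "TGTCGGTATA", "GTTATAAAAC", 
"AGGCTATATC", "ATGGTTTGTT")
# Append "<left flank><primerID><right flank>" to each read.
subjectSeqs <- xscat(subjectSeqs, xscat("AAGCGGAGCCC",ids,"TTTTTTTTTTT"))
# The Ns mark where the primerID sits between the two flanks.
patternSeq <- "AAGCGGAGCCCNNNNNNNNNNTTTTTTTTTTT"
primerIDAlignSeqs(DNAStringSet(subjectSeqs), patternSeq, doAnchored = TRUE)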

malnirav/hiReadsProcessor documentation built on July 29, 2021, 6:33 a.m.