dustyScore | R Documentation
dustyScore identifies low-complexity sequences, in a manner inspired by the dust implementation in BLAST.
dustyScore(x, batchSize=NA, ...)
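x | A DNAStringSet, or an object derived from class ShortRead. |
batchSize | NA, or the number of elements of x to process in each batch. |
... | Additional arguments, not currently used. |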
The following methods are defined:
signature(x = "DNAStringSet"): operating on an object derived from class DNAStringSet.
signature(x = "ShortRead"): operating on the sread of an object derived from class ShortRead.
The dust-like calculations used here are as implemented at https://stat.ethz.ch/pipermail/bioc-sig-sequencing/2009-February/000170.html. Scores range from 0 (all triplets unique) to the square of the width of the longest sequence (poly-A, -C, -G, or -T).
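The triplet-based scoring described above can be illustrated with a minimal base R sketch. The helper name dustLikeScore and the exact formula (each distinct triplet contributes the square of its count minus one) are assumptions made for illustration; they are not taken from the ShortRead source, and the values they produce need not match dustyScore exactly.

```r
# Hypothetical helper, not part of ShortRead: a dust-like score for a
# single sequence given as a character string.
dustLikeScore <- function(s) {
    n <- nchar(s)
    if (n < 3L) return(0)               # no triplets to count
    # Extract all overlapping triplets at positions 1..(n - 2)
    starts <- seq_len(n - 2L)
    triplets <- substring(s, starts, starts + 2L)
    # Count how often each distinct triplet occurs
    counts <- as.numeric(table(triplets))
    # Assumed scoring rule: a triplet seen once adds nothing, while
    # repeated triplets add the square of their excess count
    sum((counts - 1)^2)
}

dustLikeScore("ACGTTGCA")    # all six triplets are unique: 0
dustLikeScore("AAAAAAAAAA")  # eight copies of AAA: (8 - 1)^2 = 49
```

Under this assumed rule a homopolymer scores on the order of the square of its width, which matches the qualitative range described above.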
The batchSize argument can be used to reduce the memory requirements of the algorithm by processing the x argument in batches of the specified size. Smaller batch sizes use less memory, but are computationally less efficient.
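The batching idea can be sketched in base R as follows. The function scoreInBatches and its scoreFun argument are hypothetical names introduced here for illustration, and nchar stands in for a real scoring function; how dustyScore splits its input internally is not documented on this page.

```r
# Hypothetical helper, not part of ShortRead: apply a vectorised scoring
# function to x in consecutive batches of at most batchSize elements.
scoreInBatches <- function(x, scoreFun, batchSize = NA) {
    # With batchSize = NA (or a batch covering everything), score in one pass
    if (is.na(batchSize) || batchSize >= length(x)) {
        return(scoreFun(x))
    }
    # Assign each element to a batch: 1, 1, ..., 2, 2, ...
    batch <- ceiling(seq_along(x) / batchSize)
    # Score each batch separately, then concatenate in the original order
    unlist(lapply(split(x, batch), scoreFun), use.names = FALSE)
}

reads <- c("A", "CC", "GGG", "TTTT", "AAAAA")
scoreInBatches(reads, nchar, batchSize = 2)  # same result as nchar(reads)
```

Only one batch's intermediate data is held at a time, which is where the memory saving comes from; the extra function calls per batch are the cost.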
A vector of numeric scores, with length equal to the length of x.
Herve Pages (code); Martin Morgan
Morgulis, Getz, Schaffer and Agarwala, 2006. WindowMasker: window-based masker for sequenced genomes, Bioinformatics 22: 134-141.
The WindowMasker supplement defining dust: ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/windowmasker/windowmasker_suppl.pdf
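library(ShortRead)
# Locate the example Solexa run shipped with ShortRead and read one lane
sp <- SolexaPath(system.file('extdata', package='ShortRead'))
rfq <- readFastq(analysisPath(sp), pattern="s_1_sequence.txt")
# Smallest and largest low-complexity scores across all reads
range(dustyScore(rfq))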