Finds positions of specified sequence patterns in a list of sequences of the same length, ordered by a provided index. Sequence patterns can be consensus sequences of variable length and can contain IUPAC ambiguity codes. The position of each pattern occurrence is specified in a two-dimensional matrix, i.e. the first coordinate provides the ordinal number of the sequence and the second coordinate gives the position within the sequence where the pattern occurs.
regionsSeq | A
patterns | Character vector specifying one or more DNA sequence patterns (oligonucleotides). IUPAC ambiguity codes can be used and will match any letter in the subject that is associated with the code.
seqOrder | Integer vector specifying the order of the provided input sequences. Must have the same length as the number of sequences in the
useMulticore | Logical, should multicore be used.
nrCores | Number of cores to use when
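To illustrate how IUPAC ambiguity codes in the patterns argument can be read, here is a minimal base-R sketch that expands each code into a regular-expression character class and reports match start positions. The names iupac_map, iupac_to_regex and find_starts are hypothetical helpers introduced for this illustration; they are not part of seqPattern, and the package itself relies on matchPattern rather than regular expressions.

```r
# Base-R sketch only: iupac_map, iupac_to_regex and find_starts are
# hypothetical helpers, not functions from seqPattern or Biostrings.
iupac_map <- c(
  A = "A", C = "C", G = "G", T = "T",
  R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
  K = "[GT]", M = "[AC]",
  B = "[CGT]", D = "[AGT]", H = "[ACT]", V = "[ACG]",
  N = "[ACGT]"
)

# Translate an IUPAC pattern such as "TATAWAWR" into a regular expression.
iupac_to_regex <- function(pattern) {
  codes <- strsplit(toupper(pattern), "")[[1]]
  stopifnot(all(codes %in% names(iupac_map)))
  paste(iupac_map[codes], collapse = "")
}

# Start positions of all matches, including overlapping ones.
find_starts <- function(pattern, subject) {
  # A zero-width lookahead reports every start, even when matches overlap.
  rx <- paste0("(?=", iupac_to_regex(pattern), ")")
  hits <- gregexpr(rx, subject, perl = TRUE)[[1]]
  if (hits[1] == -1) integer(0) else as.integer(hits)
}

tata_regex  <- iupac_to_regex("TATAWAWR")
tata_starts <- find_starts("TATAWAWR", "GGTATAAATAGCC")
```

Here W stands for A or T and R for A or G, so the consensus TATAWAWR covers eight concrete octamers; the subject string above is invented for the example.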
This function uses the matchPattern function to find occurrences of given sequence patterns in a set of input sequences. Input sequences must all be of the same length and are ordered according to the index provided in the seqOrder argument, creating an n * m matrix, where n is the number of sequences and m is the length of the sequences. Positions of pattern matches in the resulting matrix are returned as two-dimensional coordinates.
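The ordering and coordinate scheme described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: gregexpr stands in for matchPattern so that the snippet runs without Bioconductor, and seqs, seq_order and pattern_coords are hypothetical names with invented toy data.

```r
# Hypothetical toy input: n = 3 sequences, each m = 6 bases long.
seqs <- c("ATATGC", "GCGCTA", "TTTTTT")
seq_order <- c(2, 3, 1)   # row 1 of the matrix is seqs[2], row 2 is seqs[3], ...

stopifnot(length(unique(nchar(seqs))) == 1,    # all sequences share one length
          length(seq_order) == length(seqs))   # one index per sequence

# Rows of the conceptual n * m matrix, in the requested order.
ordered_seqs <- seqs[seq_order]

# Hypothetical helper: (sequence, position) coordinates of a plain pattern.
pattern_coords <- function(pattern, sequences) {
  # Lookahead regex so that overlapping occurrences are all reported.
  hits <- gregexpr(paste0("(?=", pattern, ")"), sequences, perl = TRUE)
  starts <- lapply(hits, function(h) {
    if (h[1] == -1) integer(0) else as.integer(h)
  })
  data.frame(
    sequence = rep(seq_along(sequences), lengths(starts)),  # matrix row
    position = unlist(starts)                               # matrix column
  )
}

coords_ta <- pattern_coords("TA", ordered_seqs)
```

In this toy case the first coordinate refers to the row after reordering, so a hit in seqs[2] is reported with sequence equal to 1.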
The function returns a named list with one element for each sequence pattern specified in the patterns argument. Each element of the list is a data.frame with positions of the corresponding pattern in the set of input sequences. The input sequences of the same length are sorted according to the index in the seqOrder argument, and the positions of pattern matches in the resulting n * m matrix (where n is the number of sequences and m is the length of the sequences) are provided. The sequence column in the data.frame provides the ordinal number of the sequence in the ordered list of sequences, and the position column provides the start position of the pattern match within that sequence.
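As a rough illustration of working with one element of the returned list, the sketch below converts a data.frame with sequence and position columns back into an n * m indicator matrix and tallies hits per position and per sequence. The data.frame occ and the dimensions n and m are invented for the example; they are assumptions standing in for real output of the function.

```r
# Mock data in the documented shape (columns `sequence` and `position`);
# the values and the dimensions n and m are invented for illustration.
occ <- data.frame(sequence = c(1L, 1L, 2L, 4L),
                  position = c(2L, 5L, 2L, 3L))
n <- 4L   # number of sequences (matrix rows)
m <- 6L   # sequence length (matrix columns)

# Mark each match start in an n * m matrix using two-column matrix indexing.
hit_matrix <- matrix(0L, nrow = n, ncol = m)
hit_matrix[cbind(occ$sequence, occ$position)] <- 1L

per_position <- colSums(hit_matrix)   # match starts at each position
per_sequence <- rowSums(hit_matrix)   # match starts in each ordered sequence
```

Only match start positions are marked here; covering the full width of each pattern would need an extra step.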
Vanja Haberle
plotPatternDensityMap
motifScanHits
library(GenomicRanges)
library(seqPattern)

# load example promoter sequences shipped with the package
load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))
# promoter width is used below to order the sequences
promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth

# dinucleotide patterns
patternsOccurrence <- getPatternOccurrenceList(regionsSeq = zebrafishPromoters,
    patterns = c("TA", "GC"), seqOrder = order(promoterWidth))
names(patternsOccurrence)
head(patternsOccurrence[["GC"]])

# motif consensus sequence
patternsOccurrence <- getPatternOccurrenceList(regionsSeq = zebrafishPromoters,
    patterns = "TATAWAWR", seqOrder = order(promoterWidth))
names(patternsOccurrence)
head(patternsOccurrence[["TATAWAWR"]])