The function produces a base-pair resolution matrix, or a list of matrices, of scores that describe occurrences of a k-mer or PWM over predefined windows of equal width. It either finds the positions of pattern hits above a specified threshold and creates a score matrix filled with 1 (pattern present) and 0 (pattern absent), or returns a matrix of the scores themselves. If pattern is a character vector of length 1 or a single PWM matrix, the function returns a ScoreMatrix object; if it is a character vector of length greater than 1 or a list of PWMs, it returns a ScoreMatrixList.
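To make the 0/1 layout concrete, here is a minimal base-R sketch of a hit matrix over equal-width windows. It is not the package's implementation: `hit_matrix` is a hypothetical helper, the pattern is treated as a plain string, and marking only the start position of each occurrence is an assumption made for illustration.

```r
# Hypothetical helper (not part of any package): one row per window,
# one column per base pair, 1 at every position where `pattern` starts.
hit_matrix <- function(pattern, windows) {
  # all windows must have the same width
  stopifnot(length(unique(nchar(windows))) == 1)
  m <- matrix(0L, nrow = length(windows), ncol = nchar(windows[1]))
  for (i in seq_along(windows)) {
    # a lookahead is used so that overlapping occurrences are all reported
    starts <- gregexpr(paste0("(?=", pattern, ")"), windows[i], perl = TRUE)[[1]]
    m[i, starts[starts > 0]] <- 1L
  }
  m
}

windows <- c("ACGTACGTAA", "TTTTACGAAA", "GGGGGGGGGG")

# one pattern gives one matrix
single <- hit_matrix("ACG", windows)

# several patterns give a list of matrices, one per pattern
multiple <- lapply(c(ACG = "ACG", TTT = "TTT"), hit_matrix, windows = windows)
```

The single-pattern versus multi-pattern split mirrors the ScoreMatrix versus ScoreMatrixList return types described above, with plain matrices standing in for those classes.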
patternMatrix(pattern, windows, genome = NULL, min.score = 0.8,
  asPercentage = FALSE, cores = 1)

\S4method{patternMatrix}{character,DNAStringSet}(pattern, windows,
  asPercentage, cores)

\S4method{patternMatrix}{character,GRanges,BSgenome}(pattern, windows, genome,
  cores)

\S4method{patternMatrix}{matrix,DNAStringSet}(pattern, windows,
  min.score, asPercentage,
  cores)

\S4method{patternMatrix}{matrix,GRanges,BSgenome}(pattern, windows, genome,
  min.score, asPercentage,
  cores)

\S4method{patternMatrix}{list,DNAStringSet}(pattern, windows,
  min.score, asPercentage,
  cores)

\S4method{patternMatrix}{list,GRanges,BSgenome}(pattern, windows, genome,
  min.score, asPercentage,
  cores)
pattern
    a matrix (a PWM), a list of matrices, or a character vector of length 1 or more. A PWM matrix needs one row for each nucleotide ("A", "C", "G" and "T", in that order). IUPAC ambiguity codes can be used; each matches any letter in the subject that is associated with the code.
windows
    a DNAStringSet object or a GRanges object (see the method signatures); the windows must have equal width.
genome
    a BSgenome object, used when windows is a GRanges object (default: NULL).
min.score
    numeric or character indicating the minimum score to count a match. It can be given as a character string containing a percentage of the highest possible score or as a single number (by default "80%" or 0.8). If min.score is set to NULL then
asPercentage
    logical indicating whether scores represent a percentage of the maximal motif PWM score (TRUE) or raw scores (FALSE, the default in the usage signature).
cores
    the number of cores to use (default: 1). Values greater than 1 are supported only on Unix-like platforms.
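The interplay of min.score and asPercentage can be illustrated with a toy PWM in base R. This is only a sketch: `pwm_scores` is a hypothetical helper, the PWM values are made up, and dividing by the maximal attainable score is an assumed reading of "percentage of the maximal motif PWM score"; the package may normalise differently.

```r
# Toy PWM (made-up values): rows are A, C, G, T in that order,
# columns are motif positions. The best-scoring sequence is "ACG".
pwm <- matrix(c(0.7, 0.1, 0.1, 0.1,
                0.1, 0.7, 0.1, 0.1,
                0.1, 0.1, 0.7, 0.1),
              nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helper: raw PWM score at every start position of one sequence.
pwm_scores <- function(pwm, dna) {
  bases <- strsplit(dna, "")[[1]]
  k <- ncol(pwm)
  starts <- seq_len(length(bases) - k + 1)
  vapply(starts, function(s) {
    rows <- match(bases[s:(s + k - 1)], rownames(pwm))
    sum(pwm[cbind(rows, seq_len(k))])
  }, numeric(1))
}

raw <- pwm_scores(pwm, "TACGT")           # raw scores
max_score <- sum(apply(pwm, 2, max))      # highest possible score
pct <- raw / max_score                    # scores as a fraction of the maximum
hits <- as.integer(pct >= 0.8)            # 1/0 after thresholding at 80%
```

Here `raw` corresponds to raw scores, `pct` to percentage-style scores, and `hits` to the 1/0 matrix row obtained once a threshold is applied.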
patternMatrix is based on functions from the seqPattern package: getPatternOccurrenceList, which finds the positions of a pattern given as a character vector in a list of sequences (a DNAStringSet object), and an adapted version of motifScanHits, which finds a pattern given as a PWM matrix in sequences (a DNAStringSet object).
If cores > 1, pattern occurrences are counted for the windows in parallel.
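Per-window parallelism of this kind is commonly written in R with `parallel::mclapply`, which forks worker processes and therefore only runs with more than one core on Unix-like platforms. The sketch below shows the general technique and makes no claim about how patternMatrix is implemented internally; `count_hits` is a hypothetical helper.

```r
library(parallel)

# Hypothetical helper: number of occurrences of a fixed string in one window.
count_hits <- function(window, pattern) {
  starts <- gregexpr(pattern, window, fixed = TRUE)[[1]]
  sum(starts > 0)
}

windows <- c("ACGTACGTAA", "TTTTACGAAA", "GGGGGGGGGG")

# mc.cores > 1 relies on forking, so fall back to 1 core elsewhere
cores <- if (.Platform$OS.type == "unix") 2L else 1L

# each window is processed independently, so the work splits cleanly
counts <- unlist(mclapply(windows, count_hits, pattern = "ACG",
                          mc.cores = cores))
```

Because every window is handled independently, the result is the same regardless of the number of cores; only the wall-clock time changes.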
Returns a ScoreMatrix object or a ScoreMatrixList object.
library(Biostrings)
library(genomation)  # provides patternMatrix
# consensus sequence of the CTCF motif
motif = "CCGCGNGGNGGCAG"
# create 10 random DNA sequences, each 180 bp long
seqs = sapply(1:10,
  function(x) paste(sample(c("A","T","G","C"), 180, replace=TRUE), collapse=""))
windows = DNAStringSet(seqs)
# scan every window for the motif
p = patternMatrix(pattern=motif, windows=windows, min.score=0.8)
p