findMotifHits | R Documentation |
findMotifHits scans sequences (provided either as a file, an R object or genomic coordinates) for matches to positional weight matrices (provided as a file or as R objects).
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'character,character'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'character,DNAString'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'character,DNAStringSet'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrix,character'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrix,DNAString'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrix,DNAStringSet'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrixList,character'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrixList,DNAString'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrixList,DNAStringSet'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrix,GRanges'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
## S4 method for signature 'PWMatrixList,GRanges'
findMotifHits(
query,
subject,
min.score,
method = c("matchPWM", "homer2"),
homerfile = findHomer("homer2"),
BPPARAM = SerialParam(),
genome = NULL
)
query |
The motifs to search for, either a
|
subject |
The sequences to be searched, either a
|
min.score |
The minimum score for counting a match. Can be given as a character string containing a percentage (e.g. "85%") of the highest possible score or as a single number. |
method |
The internal method to use for motif searching. One of "matchPWM" or "homer2".
Please note that the two methods might give slightly different results (see details). |
homerfile |
Path and file name of the |
BPPARAM |
An optional |
genome |
|
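The two forms of min.score can be illustrated with a small base R sketch. It assumes that the "highest possible score" of a PWM is the sum of the column-wise maxima, and that a percentage string is resolved against that value. The helper resolveMinScore is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): turn a min.score value
# into a numeric threshold for a given PWM.
resolveMinScore <- function(min.score, pwm) {
    # numeric thresholds are used unchanged
    if (is.numeric(min.score)) return(min.score)
    # assumed highest possible score: best-scoring base at every position
    maxScore <- sum(apply(pwm, 2, max))
    # strip the trailing "%" and take that percentage of the maximum
    pct <- as.numeric(sub("%$", "", min.score))
    maxScore * pct / 100
}

m <- rbind(A = c(2, 0, 0),
           C = c(1, 1, 0),
           G = c(0, 2, 0),
           T = c(0, 0, 3))
resolveMinScore("85%", m)  # 85% of the maximum score 2 + 2 + 3 = 7
resolveMinScore(7, m)      # numeric value, returned as is
```

With this matrix, a numeric min.score of 7 equals the maximum score, so only perfect matches would pass the threshold.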
The implemented methods (matchPWM and homer2) are there for convenience (method = "matchPWM" calls Biostrings::matchPWM internally in an optimized fashion, and method = "homer2" calls the command line tool from Homer and therefore requires an installation of Homer).
In general, running findMotifHits with the same parameters using any of the methods generates identical results. Some minor differences could occur that result from rounding errors during the necessary conversion of PWMs (log2-odds scores) to the probability matrices needed by Homer, and the conversion of scores from and to the natural log scale used by Homer. These conversions are implemented transparently for the user, so that the arguments of findMotifHits do not have to be adjusted (e.g. the PWMs should always contain log2-odds scores, and min.score is always on the log2 scale).
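The conversions mentioned above can be sketched in base R. This is only an illustration, not the package's internal code: it assumes a uniform background frequency of 0.25 per base, and the names bg, prob, pwm and probBack are chosen here for the example.

```r
# Assumption: uniform background frequency for A, C, G and T
bg <- 0.25

# toy probability matrix (rows = bases, columns = motif positions)
prob <- cbind(c(A = 0.7, C = 0.1, G = 0.1, T = 0.1),
              c(A = 0.1, C = 0.1, G = 0.7, T = 0.1))

# probabilities -> log2-odds scores (the scale expected in a PWM)
pwm <- log2(prob / bg)

# log2-odds scores -> probabilities (the representation needed by Homer)
probBack <- bg * 2^pwm
all.equal(prob, probBack)

# a score on the log2 scale and the same score on the natural log scale
scoreLog2 <- sum(apply(pwm, 2, max))
scoreLn <- scoreLog2 * log(2)
scoreLn / log(2)  # converted back to the log2 scale
```

Both conversions are exact in theory. In floating point arithmetic the round trip can differ in the last digits, which is the kind of rounding error that could move a hit just above or below min.score.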
If there are bases with frequencies of less than 0.001 in a motif, Homer will set them to 0.001 and adjust the other frequencies at that motif position accordingly so that they sum to 1.0. This may differ from the adjustment used when scanning a PWM with matchPWM (e.g. the pseudocounts argument in the toPWM function), and thus can give rise to differences in reported motif hits and hit scores (typically only low-scoring hits).
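The flooring of small base frequencies can be illustrated as follows. The text above does not spell out exactly how Homer rescales the remaining frequencies, so this sketch assumes they are scaled proportionally. The helper floorFreqs is hypothetical and serves only as an illustration.

```r
# Hypothetical helper: raise frequencies below minFreq to minFreq and
# rescale the remaining ones so that the position sums to 1 again.
# Assumption: the remaining frequencies are scaled proportionally.
floorFreqs <- function(p, minFreq = 0.001) {
    low <- p < minFreq
    p[low] <- minFreq
    p[!low] <- p[!low] * (1 - sum(low) * minFreq) / sum(p[!low])
    p
}

# one motif position with three bases below the threshold
p <- c(A = 0.9995, C = 0.0005, G = 0, T = 0)
q <- floorFreqs(p)
q
sum(q)
```

A PWM built from the original frequencies with a different pseudocount would assign different scores to the rare bases, which is why differences are expected mostly among low-scoring hits.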
A GRanges object with the matches to query in subject.
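# four short example sequences to scan
seqs <- Biostrings::DNAStringSet(c(s1 = "GTCAGTCGATC", s2 = "CAGTCTAGCTG",
                                   s3 = "CGATCGTCAGT", s4 = "AGCTGCAGTCT"))

# score matrix with one row per base and one column per motif position
m <- rbind(A = c(2, 0, 0),
           C = c(1, 1, 0),
           G = c(0, 2, 0),
           T = c(0, 0, 3))

# two motifs: m1 uses m as is, m2 uses m with its columns reversed
pwms <- TFBSTools::PWMatrixList(
    TFBSTools::PWMatrix(ID = "m1", profileMatrix = m),
    TFBSTools::PWMatrix(ID = "m2", profileMatrix = m[, 3:1])
)

# report matches with a score of at least 7
findMotifHits(pwms, seqs, min.score = 7)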