Score all potential binding sites in an MS object.
If a PWM has N rows, then score every observed N-mer in the MS object.
The score is given by the log likelihood of the N-mer given the PWM, minus the
log likelihood of the N-mer under the Markov model specified by mm.
By default, only potential binding sites with scores > 0 are returned, but
this can be modified with the `threshold` argument.
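The scoring rule described above can be illustrated with a minimal base-R sketch. It is not the rtfbs implementation: the toy PWM values, the uniform zeroth-order background (standing in for the Markov model `mm`), and the helper name `score_windows` are all assumptions made for illustration.

```r
# Toy 3-position motif: rows are positions, columns are base probabilities.
# Illustrative values only; not taken from any real motif.
pwm <- matrix(c(0.7, 0.1, 0.1, 0.1,
                0.1, 0.7, 0.1, 0.1,
                0.1, 0.1, 0.1, 0.7),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("A", "C", "G", "T")))

# Assumed zeroth-order background with uniform base frequencies.
bg <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

# Hypothetical helper: log-likelihood ratio for every N-mer in `dna`,
# where N is the number of rows in the PWM.
score_windows <- function(dna, pwm, bg) {
  bases <- strsplit(dna, "")[[1]]
  n <- nrow(pwm)
  if (length(bases) < n) return(numeric(0))
  starts <- seq_len(length(bases) - n + 1)
  vapply(starts, function(s) {
    kmer <- bases[s:(s + n - 1)]
    # Log likelihood of the N-mer under the motif model
    ll_motif <- sum(log(pwm[cbind(seq_len(n), match(kmer, colnames(pwm)))]))
    # Log likelihood of the N-mer under the background model
    ll_bg <- sum(log(bg[kmer]))
    ll_motif - ll_bg
  }, numeric(1))
}

scores <- score_windows("GGACTGG", pwm, bg)
hits <- which(scores > 0)  # mimic the default threshold of 0
```

In this sketch a positive score means the N-mer is more likely under the motif than under the background, which is why 0 is a natural default cutoff.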


`ms` |
MS object containing at least one sequence |

`pwm` |
Position Weight matrix representing transcription factor motif |

`mm` |
Markov Model associated with given sequences, which represents the null model |

`conservative` |
(Logical value) If TRUE, sequences containing N's are given a log likelihood of negative infinity under the PWM model. If FALSE, any 'N' encountered does not contribute to the score. |

`threshold` |
(Numeric value) Only sites with scores above this threshold are returned (default = 0) |

`strand` |
One of "best", "both", "+", or "-", specifying which strand(s) to return results for. If "both", search for binding sites in both directions and return all results found. If "best", search for binding sites in both directions, but for each N-mer return only the maximum score over the two strands. If "+", look only on the forward strand; if "-", look only on the reverse strand. |

`return_posteriors` |
(Logical value) If TRUE, returns a list structure. Scores represent the motif 'match score', or the product of the probabilities of observing each base under the motif or background models. Scores are returned under the motif model for all positions in the sequence, on both forward and reverse strands, and under the background model. Note that the `strand` and `threshold` options are both ignored in this case. If FALSE, returns scores and locations for possible binding sites as a feature object. |
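The `conservative` and `strand` options can be pictured with a small base-R sketch. It covers only the PWM side of the score (the background term is omitted), and the toy PWM and the helper names `revcomp` and `kmer_loglik` are hypothetical, not part of rtfbs.

```r
# Toy 3-position motif (illustrative probabilities only).
pwm <- matrix(c(0.7, 0.1, 0.1, 0.1,
                0.1, 0.7, 0.1, 0.1,
                0.1, 0.1, 0.1, 0.7),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("A", "C", "G", "T")))

# Hypothetical helper: reverse complement of a DNA string.
revcomp <- function(dna) {
  comp <- chartr("ACGTN", "TGCAN", dna)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Hypothetical helper: log likelihood of one N-mer under the PWM,
# following the two documented policies for 'N'.
kmer_loglik <- function(kmer, pwm, conservative = TRUE) {
  bases <- strsplit(kmer, "")[[1]]
  is_n <- bases == "N"
  # conservative = TRUE: any N makes the site impossible under the motif
  if (any(is_n) && conservative) return(-Inf)
  # conservative = FALSE: positions holding an N are skipped
  pos <- which(!is_n)
  sum(log(pwm[cbind(pos, match(bases[pos], colnames(pwm)))]))
}

fwd_ll <- kmer_loglik("ACT", pwm)            # forward strand
rev_ll <- kmer_loglik(revcomp("ACT"), pwm)   # reverse strand

best <- max(fwd_ll, rev_ll)                  # strand = "best": one score per N-mer
both <- c("+" = fwd_ll, "-" = rev_ll)        # strand = "both": keep each strand

strict  <- kmer_loglik("ANT", pwm, conservative = TRUE)
lenient <- kmer_loglik("ANT", pwm, conservative = FALSE)
```

Here `strict` is negative infinity, so the site can never pass a finite threshold, while `lenient` is the sum over the two non-N positions only.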

Scores and locations for possible binding sites, returned as a feature object. If `return_posteriors` is TRUE, a list structure is returned instead (see above).

If a PWM file contains multiple PWMs, then read.pwm will return a list of PWMs. This function takes a single PWM.
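One way to handle a file with several PWMs is to score each matrix in turn. The sketch below is a hedged illustration: `score_each_pwm` is a hypothetical helper, not an rtfbs function, and it assumes that a single PWM is a plain matrix while multiple PWMs arrive as a list.

```r
# Hypothetical helper: apply a scoring function to one PWM or to a list of PWMs.
score_each_pwm <- function(pwms, score_fun) {
  # Wrap a single matrix so both cases are handled the same way
  if (!is.list(pwms)) pwms <- list(pwms)
  lapply(pwms, score_fun)
}

# With rtfbs loaded and `ms`, `mm`, and `pwmFile` defined as in the example,
# a call might look like:
# results <- score_each_pwm(read.pwm(pwmFile),
#                           function(p) score.ms(ms, p, mm))
```

The result is always a list with one element per PWM, so downstream code does not need to branch on how many motifs the file held.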

`read.ms split.ms groupByGC.ms build.mm read.pwm`

```
library(rtfbs)
exampleArchive <- system.file("extdata", "NRSF.zip", package="rtfbs")
seqFile <- "input.fas"
unzip(exampleArchive, seqFile)
# Read in FASTA file "input.fas" from the examples into an
# MS (multiple sequences) object
ms <- read.ms(seqFile)
pwmFile <- "pwm.meme"
unzip(exampleArchive, pwmFile)
# Read in Position Weight Matrix (PWM) from MEME file from
# the examples into a Matrix object
pwm <- read.pwm(pwmFile)
# Build a 3rd order Markov Model to represent the sequences
# in the MS object "ms". The Model will be a list of
# matrices corresponding in size to the order of the
# Markov Model
mm <- build.mm(ms, 3)
# Match the PWM against the sequences provided to find
# possible transcription factor binding sites. A
# Features object is returned, containing the location
# of each possible binding site and an associated score.
# Sites with a negative score are not returned unless
# we set threshold=-Inf as a parameter.
score.ms(ms, pwm, mm)
```
