For calculating P-values/log-odds scores for any number of motifs.
motif_pvalue(motifs, score, pvalue, bkg.probs, use.freq = 1, k = 8,
  nthreads = 1, rand.tries = 10, rng.seed = sample.int(10000, 1),
  allow.nonfinite = FALSE)

motifs 
See 
score 

pvalue 

bkg.probs 

use.freq 

k 

nthreads 

rand.tries 

rng.seed 

allow.nonfinite 

Calculating P-values for motifs can be very computationally intensive. This is due to how P-values must be calculated: for a given score, all possible sequences which score equal or higher must be found, and the probability of each of these sequences (based on background probabilities) summed. For a DNA motif of length 10, the number of possible unique sequences is 4^10 = 1,048,576. Finding all sequences that score at or above a given score can be done efficiently with a branch-and-bound algorithm, but as the motif length increases even this calculation becomes impractical. To get around this, the P-value calculation can be approximated.
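The exact calculation described above can be sketched in base R for a very short motif. This is only a brute-force illustration, not the branch-and-bound routine used by the package; the matrix `pwm`, the uniform background `bkg`, and the helper `exact_pvalue()` are hypothetical names and values invented for this example.

```r
# Hypothetical 3-position log-odds PWM (rows A, C, G, T); values are illustrative.
pwm <- matrix(c( 1.2, -0.8, -1.5,
                -0.9,  1.0, -0.7,
                -1.1, -0.6,  1.3,
                -0.5, -1.2, -0.4),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))
# Assumed uniform background, in the same row order as the PWM
bkg <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

exact_pvalue <- function(pwm, score, bkg) {
  n <- ncol(pwm)
  # Every possible sequence, one per row, stored as row indices into the PWM
  idx <- as.matrix(expand.grid(rep(list(seq_len(nrow(pwm))), n)))
  # Score of a sequence = sum of the PWM entries it selects
  seq_scores <- rowSums(sapply(seq_len(n), function(j) pwm[idx[, j], j]))
  # Probability of a sequence = product of its background probabilities
  seq_probs <- apply(idx, 1, function(i) prod(bkg[i]))
  # P-value = total probability of all sequences scoring at or above `score`
  sum(seq_probs[seq_scores >= score])
}

exact_pvalue(pwm, score = 3, bkg = bkg)
```

With these illustrative values only the sequence ACG scores 3 or higher, so the result is 1/64. The enumeration grows as 4^n, which is why this direct approach is only workable for short motifs.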
In order to calculate P-values for longer motifs, this function uses the
approximation proposed in the cited reference (citation key pvalues; see
the reference entry below), where the motif is split into subsets, P-values
are calculated for the subsets, and these are finally combined into a total
P-value. The smaller the subsets, the faster the calculation, but also the
coarser the approximation. This can be controlled by setting k. In fact,
for smaller motifs (< 13 positions), exact P-values can be calculated
individually in reasonable time by setting k = 12.
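As a rough illustration of the effect of k, the call from the Examples can be repeated with different subset sizes. This sketch assumes the universalmotif package is installed and that its bundled examplemotif is 7 positions wide (the example matches further down are 7 letters long), so k = 4 would trigger the approximation while k = 8 would give the exact calculation. The chosen values of k are arbitrary, and package versions may treat k differently.

```r
# Guarded so the snippet is skipped when universalmotif is not installed
if (requireNamespace("universalmotif", quietly = TRUE)) {
  library(universalmotif)
  data(examplemotif)

  ks <- c(4, 8)  # arbitrary subset sizes chosen for illustration
  pvals <- sapply(ks, function(k) motif_pvalue(examplemotif, score = 0, k = k))
  names(pvals) <- paste0("k=", ks)
  print(pvals)
}
```

Comparing the two values gives a feel for how much the approximation shifts the result for a given motif.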
To calculate a score from a P-value, all possible scores are calculated
and the score at the (1 - pvalue) * 100 percentile is returned.
When k < ncol(motif), the complete set of scores is instead approximated
by randomly adding up possible scores from each subset.
It is important to keep in mind that no consideration is given to
background frequencies in the score calculator. Note that this approximation
can be quite expensive at times, and even slower than the exact version;
for jobs requiring many repeated calculations, some benchmarking beforehand
can help find the optimal settings.
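The score calculation and its subset-based approximation can be sketched in base R as follows. This is one plausible reading of the description above, not the package's internal code; the random matrix `pwm`, the helper `all_scores()`, the subset size `k = 3`, and the number of random draws are all assumptions made for the illustration.

```r
set.seed(1)
# Hypothetical 6-position log-odds PWM with random illustrative values
pwm <- matrix(round(rnorm(24), 2), nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# All possible scores for a block of PWM columns (4^ncol values)
all_scores <- function(m) {
  Reduce(function(acc, j) as.vector(outer(acc, m[, j], `+`)),
         seq_len(ncol(m)), 0)
}

pvalue <- 0.01

# Exact version: enumerate every score, then take the percentile
exact <- all_scores(pwm)  # 4^6 = 4096 scores
exact_threshold <- quantile(exact, probs = 1 - pvalue)

# Approximate version with k = 3: enumerate each 3-column subset separately,
# then add up randomly drawn scores, one from each subset
k <- 3
blocks <- split(seq_len(ncol(pwm)), ceiling(seq_len(ncol(pwm)) / k))
block_scores <- lapply(blocks, function(cols) all_scores(pwm[, cols, drop = FALSE]))
n_draws <- 1e5
approx <- Reduce(`+`, lapply(block_scores,
                             function(s) sample(s, n_draws, replace = TRUE)))
approx_threshold <- quantile(approx, probs = 1 - pvalue)

c(exact = unname(exact_threshold), approx = unname(approx_threshold))
```

Every score is weighted equally here, which mirrors the statement that background frequencies are ignored in the score calculator. The sketch also shows why the approximation can be slower than expected: the number of random draws, not the motif length alone, drives the cost.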
To get an idea as to how the score calculator works (without approximation), try the following code with your motif (be careful with longer motifs):
quantile(get_scores(motif), probs = 0.99)
numeric. A vector of scores/P-values.
Benjamin Jean-Marie Tremblay, b2tremblay@uwaterloo.ca
pvalues (citation key in the universalmotif package bibliography)
if (R.Version()$arch != "i386") {
  ## P-value/score calculations are performed using the PWM version of the
  ## motif
  data(examplemotif)
  ## Get a minimum score based on a P-value
  motif_pvalue(examplemotif, pvalue = 0.001)
  ## Get the probability of a particular sequence hit
  motif_pvalue(examplemotif, score = 0)
  ## The calculations can be performed for multiple motifs
  motif_pvalue(list(examplemotif, examplemotif), pvalue = c(0.001, 0.0001))
  ## Compare score thresholds and P-values:
  scores <- motif_score(examplemotif, c(0.6, 0.7, 0.8, 0.9))
  motif_pvalue(examplemotif, scores)
  ## Calculate the probability of getting a certain match or better:
  TATATAT <- score_match(examplemotif, "TATATAT")
  TATATAG <- score_match(examplemotif, "TATATAG")
  motif_pvalue(examplemotif, TATATAT)
  motif_pvalue(examplemotif, TATATAG)
  ## Get all possible matches by P-value:
  get_matches(examplemotif, motif_pvalue(examplemotif, pvalue = 0.0001))
}
