daudin | R Documentation |
Calculates the exact p-value in the identically and independently distributed of a given local score, a sequence length that 'must not be too large' and for a given score distribution
daudin(
local_score,
sequence_length,
score_probabilities,
sequence_min = NULL,
sequence_max = NULL,
score_values = NULL
)
local_score |
the observed local score |
sequence_length |
length of the sequence |
score_probabilities |
the probabilities for each score from lowest to greatest (Optionnaly with scores as names) |
sequence_min |
minimum score (optional if |
sequence_max |
maximum score (optional if |
score_values |
vector of integer score values, associated to score_probabilities (optional if
|
Either sequence_min
and sequence_max
are specified as input, OR all possible score values in
score_values
vector ; one of this choice is required. <cr>
Small in this context depends heavily on your machine.
On a 3,7GHZ machine this means for daudin(1000, 5000, c(0.2, 0.2, 0.2, 0.1, 0.2, 0.1), -2, 3)
an execution time of ~2 seconds. This is due to the calculation method using matrix exponentiation
which takes times. The size of the matrix of the exponentiation is equal to a+1 with a the
local score value. The matrix must be put at the power n, with n the sequence length.
Moreover, it is known that the local score value is expected to be in mean of order log(n).
A double representing the probability of a local score as high as the one given as argument.
p1 <- daudin(local_score = 4, sequence_length = 50,
score_probabilities = c(0.2, 0.3, 0.1, 0.2, 0.1, 0.1),
sequence_min = -3, sequence_max = 2)
p2 <- daudin(local_score = 4, sequence_length = 50,
score_probabilities = c(0.2, 0.3, 0.1, 0.2, 0.1, 0.1),
score_values = as.integer(-3:2))
p1 == p2 # TRUE
prob <- c(0.08, 0.32, 0.08, 0.00, 0.08, 0.00, 0.00, 0.08, 0.02, 0.32, 0.02)
score_values <- which(prob != 0) - 6 # keep only non null probability scores
prob0 <- prob[prob != 0] # and associated probability
p <- daudin(150, 10000, prob, sequence_min = -5, sequence_max = 5)
p0 <- daudin(150, 10000, prob0, score_values = score_values)
names(prob0) <- score_values
p1 <- daudin(150, 10000, prob0)
p == p0 # TRUE
p == p1 # TRUE
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.