Scoring | R Documentation |
score_degen
Determines the degeneration score of a sequence.
score_conservation
Determines the sequence conservation scores of a set of templates using Shannon entropy.
score_primers
Computes scores for a set of primers based on the deviations of the primers from the constraints.
score_conservation(template.df, gap.char = "-", win.len = 30, by.group = TRUE)
score_degen(seq, gap.char = "-")
score_primers(
primer.df,
settings,
active.constraints = names(constraints(settings)),
alpha = 0.5
)
template.df |
A |
gap.char |
The gap character in the sequences. The default is "-". |
win.len |
The size of a window for evaluating conservation. The default window size is set to 30. |
by.group |
Whether the determination of binding regions
should be stratified according to the groups defined in |
seq |
A list of vectors containing individual characters of a nucleotide sequence. |
primer.df |
A |
settings |
A |
active.constraints |
A character vector of constraint identifiers that are considered for scoring the primers. |
alpha |
A numeric that is used to determine the trade-off
between the impact of the maximal observed deviation and the total
deviation. At its default |
score_degen
computes the degeneration of an ambiguous sequence
by considering the number of unambiguous sequences that
are represented by the ambiguous sequence.
Let a sequence S of length n be represented by a collection of sets such that
S = \{s_1, s_2, \ldots, s_n\}
where s_i indicates the set of unambiguous bases found at position i of the primer. Then the degeneracy D of a primer can be defined as
D = \prod_{i=1}^{n} |s_i|
where |s_i| provides the number of disambiguated bases at position i.
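The degeneracy product can be illustrated with a minimal base R sketch. This is not the package's implementation: `iupac_sizes` and `degeneracy_sketch` are hypothetical names introduced here, the lookup table follows the standard IUPAC nucleotide codes, and skipping gap characters is an assumption about how gaps might be treated.

```r
# Hypothetical lookup table: number of unambiguous bases |s_i| per IUPAC code.
iupac_sizes <- c(a = 1, c = 1, g = 1, t = 1,
                 r = 2, y = 2, s = 2, w = 2, k = 2, m = 2,
                 b = 3, d = 3, h = 3, v = 3, n = 4)

# Hypothetical helper: degeneracy D = prod_i |s_i| of one sequence,
# given as a vector of single characters.
degeneracy_sketch <- function(chars, gap.char = "-") {
  chars <- tolower(chars)
  chars <- chars[chars != gap.char]   # assumption: gaps do not contribute
  sizes <- iupac_sizes[chars]
  if (anyNA(sizes)) stop("unknown nucleotide code in sequence")
  prod(sizes)
}

seqs <- strsplit(c("ctggaattacggtacc", "taggaaccggrtaagc", "rtaaasrygtar"), split = "")
degen_sketch <- vapply(seqs, degeneracy_sketch, numeric(1))
degen_sketch
```

The first sequence has no ambiguous bases (D = 1), the second has a single `r` (D = 2), and the third has five two-fold ambiguity codes (D = 2^5 = 32).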
score_primers
determines the penalty of a primer in the following way. Let d be a vector indicating the absolute deviations from individual constraints and let p be the scalar penalty that is assigned to a primer. We define
p = \alpha \cdot \max_i d_i + (1 - \alpha) \cdot \sum_i d_i
such that for large values of alpha the maximal deviation dominates, giving rise to a local penalty (reflecting the largest absolute deviation), and for small alpha the total deviation dominates, giving rise to a global penalty (reflecting the sum of constraint deviations). When alpha is 1, only the most extreme absolute deviation is considered, and when alpha is 0, the sum of all absolute deviations is computed.
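The trade-off controlled by alpha can be checked numerically with a short base R sketch of the penalty formula. `penalty_sketch` is a hypothetical helper and the deviation values are made up for illustration; how the package derives the deviations from the constraint settings is not shown here.

```r
# Hypothetical helper: p = alpha * max(d) + (1 - alpha) * sum(d)
penalty_sketch <- function(d, alpha = 0.5) {
  stopifnot(alpha >= 0, alpha <= 1)
  d <- abs(d)                          # deviations are taken as absolute values
  alpha * max(d) + (1 - alpha) * sum(d)
}

d <- c(1, 4, 0)                        # made-up deviations from three constraints

p_local  <- penalty_sketch(d, alpha = 1)    # only the largest deviation: 4
p_global <- penalty_sketch(d, alpha = 0)    # sum of all deviations: 5
p_mixed  <- penalty_sketch(d, alpha = 0.5)  # 0.5 * 4 + 0.5 * 5 = 4.5
```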
A list containing Entropies and Alignments.
Entropies
is a data frame with conservation scores. Each column indicates a position in the alignment of template sequences and each row gives the entropies of the sequences belonging to a specific group of template sequences.
Alignments
is a list of DNABin objects, where each object gives the alignment corresponding to one group of template sequences.
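As a rough illustration of what a per-position Shannon entropy looks like, the following base R sketch computes the entropy of each column of a small, made-up alignment. `column_entropy` is a hypothetical helper; the use of log base 2, the exclusion of gaps, and the absence of any windowing by win.len are assumptions and may differ from what score_conservation actually does.

```r
# Hypothetical helper: Shannon entropy (in bits) of one alignment column.
column_entropy <- function(column, gap.char = "-") {
  column <- column[column != gap.char]   # assumption: gaps are ignored
  if (length(column) == 0) return(NA_real_)
  freq <- as.numeric(table(column)) / length(column)
  -sum(freq * log2(freq))
}

# Made-up alignment of four sequences; rows are sequences, columns are positions.
aln <- do.call(rbind, strsplit(c("acgt", "acga", "a-gc", "atgg"), split = ""))
entropies_sketch <- apply(aln, 2, column_entropy)
entropies_sketch
```

A fully conserved column yields an entropy of 0, while a column in which all four bases occur equally often yields the maximum of 2 bits.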
score_degen
finds the number of unambiguous sequences that are represented by seq.
score_primers
returns a data frame containing scores for individual primers.
score_conservation
requires the MAFFT software
for multiple alignments (http://mafft.cbrc.jp/alignment/software/).
## Not run:
data(Ippolito)
entropy.data <- score_conservation(template.df, gap.char = "-", win.len = 18, by.group = TRUE)
## End(Not run)
# Compute degeneration for sequences with differing number of ambiguous bases
seq <- strsplit(c("ctggaattacggtacc", "taggaaccggrtaagc", "rtaaasrygtar"), split = "")
degen <- score_degen(seq)
# Score the primers
data(Ippolito)
primer.scores <- score_primers(primer.df, settings)