Scoring: Scoring Functions.
In matdoering/openPrimeR: Multiplex PCR Primer Design and Analysis

Description

score_degen: Determines the degeneration score of a sequence.
score_conservation: Determines the sequence conservation scores of a set of templates using Shannon entropy.
score_primers: Computes scores for a set of primers based on the deviations of the primers from the constraints.

Usage

score_conservation(template.df, gap.char = "-", win.len = 30, by.group = TRUE)

score_degen(seq, gap.char = "-")

score_primers(
  primer.df,
  settings,
  active.constraints = names(constraints(settings)),
  alpha = 0.5
)

Arguments

`template.df`	A `Templates` object providing the set of templates.
`gap.char`	The gap character in the sequences. The default is "-".
`win.len`	The size of a window for evaluating conservation. The default window size is set to 30.
`by.group`	Whether the determination of binding regions should be stratified according to the groups defined in `template.df`. The default is `TRUE`.
`seq`	A list of vectors containing individual characters of a nucleotide sequence.
`primer.df`	A `Primers` object containing the primers.
`settings`	A `DesignSettings` object containing the analysis settings.
`active.constraints`	A character vector of constraint identifiers that are considered for scoring the primers.
`alpha`	A numeric that sets the trade-off between the impact of the maximal observed deviation and the total deviation. By default, `alpha` is 0.5, so that the maximal deviation and the total deviation are weighted equally when the penalties are computed.

Details

score_degen computes the degeneration of an ambiguous sequence by counting the unambiguous sequences that the ambiguous sequence represents. Let a sequence S of length n be represented by a collection of sets such that

S = {s_1, s_2, \ldots, s_n}

where s_i indicates the set of unambiguous bases found at position i of the primer. Then the degeneracy D of a primer can be defined as

D = \prod_i{|s_i|}

where |s_i| is the number of unambiguous bases that are possible at position i.
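
The product above can be sketched in a few lines of base R. This is an illustration of the formula only, not the package's implementation: the lookup table `iupac_counts` and the function `degeneracy_sketch` are hypothetical names, and skipping gap characters is an assumption about how gaps should be treated.

```r
# Hypothetical lookup (not part of openPrimeR): number of unambiguous
# bases encoded by each IUPAC nucleotide code, i.e. |s_i| per symbol.
iupac_counts <- c(
  a = 1, c = 1, g = 1, t = 1, u = 1,
  r = 2, y = 2, s = 2, w = 2, k = 2, m = 2,
  b = 3, d = 3, h = 3, v = 3,
  n = 4
)

# Hypothetical helper: degeneracy D = prod_i |s_i| for one sequence,
# given as a vector of single characters.
degeneracy_sketch <- function(seq_chars, gap.char = "-") {
  chars <- tolower(seq_chars)
  chars <- chars[chars != gap.char]  # assumption: gaps do not contribute
  counts <- iupac_counts[chars]
  if (anyNA(counts)) stop("sequence contains a non-IUPAC character")
  prod(counts)
}

seqs <- strsplit(c("ctggaattacggtacc", "taggaaccggrtaagc", "rtaaasrygtar"), split = "")
degen <- vapply(seqs, degeneracy_sketch, numeric(1))
print(degen)  # 1 2 32
```

The first sequence has no ambiguous bases (D = 1), the second has a single `r` (D = 2), and the third has five two-fold ambiguity codes (D = 2^5 = 32).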

score_primers determines the penalty of a primer in the following way. Let d be a vector indicating the absolute deviations from individual constraints and let p be the scalar penalty that is assigned to a primer. We define

p = \alpha \cdot \max_i d_i + \sum_i (1 - \alpha) \cdot d_i

such that for large values of alpha the maximal deviation dominates, giving rise to a local penalty (reflecting the largest absolute deviation), and for small values of alpha the total deviation dominates, giving rise to a global penalty (reflecting the sum of constraint deviations). When alpha is 1, only the most extreme absolute deviation is considered; when alpha is 0, the sum of all absolute deviations is computed.
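
To make the effect of alpha concrete, the penalty formula can be evaluated directly on a small deviation vector. The function `penalty_sketch` and the values in `d` are hypothetical and serve only to illustrate the formula; how score_primers derives the deviations from the constraint settings is not shown here.

```r
# Hypothetical helper: p = alpha * max(d) + (1 - alpha) * sum(d),
# where d holds the absolute deviations from the individual constraints.
penalty_sketch <- function(d, alpha = 0.5) {
  stopifnot(length(d) > 0, alpha >= 0, alpha <= 1)
  d <- abs(d)
  alpha * max(d) + (1 - alpha) * sum(d)
}

d <- c(0.5, 2, 0)  # made-up deviations for three constraints

p_local  <- penalty_sketch(d, alpha = 1)    # 2: only the largest deviation
p_global <- penalty_sketch(d, alpha = 0)    # 2.5: sum of all deviations
p_mixed  <- penalty_sketch(d, alpha = 0.5)  # 2.25: equal weighting of both
```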

Value

score_conservation returns a list containing Entropies and Alignments. Entropies is a data frame with conservation scores. Each column indicates a position in the alignment of template sequences and each row gives the entropies of the sequences belonging to a specific group of template sequences. Alignments is a list of DNAbin objects, where each object gives the alignment corresponding to one group of template sequences.
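
For orientation, the following sketch shows how a textbook Shannon entropy can be computed for each column of a small alignment. It is not the package's implementation: `column_entropy_sketch` is a hypothetical helper, the alignment is made up, and the windowing via `win.len`, the treatment of gaps, and any normalization applied by score_conservation are not reproduced (ignoring gaps is an assumption here).

```r
# Hypothetical helper: Shannon entropy (in bits) per alignment column.
# aln is a character vector of aligned sequences of equal length.
column_entropy_sketch <- function(aln, gap.char = "-") {
  mat <- do.call(rbind, strsplit(tolower(aln), split = ""))
  apply(mat, 2, function(col) {
    col <- col[col != gap.char]        # assumption: gaps are ignored
    if (length(col) == 0) return(NA_real_)  # column with gaps only
    p <- table(col) / length(col)      # relative base frequencies
    -sum(p * log2(p))
  })
}

aln <- c("acgt-", "acga-", "aagc-", "atgg-")  # made-up alignment
entropies <- column_entropy_sketch(aln)
print(entropies)  # 0.0 1.5 0.0 2.0 NA
```

Fully conserved columns have an entropy of 0, while a column in which all four bases occur equally often reaches the maximum of 2 bits.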

score_degen returns the number of unambiguous sequences that are represented by seq.

score_primers returns a data frame containing scores for individual primers.

Note

score_conservation requires the MAFFT software for multiple alignments (http://mafft.cbrc.jp/alignment/software/).

Examples

## Not run: 
data(Ippolito)
entropy.data <- score_conservation(template.df, gap.char = "-", win.len = 18, by.group = TRUE)

## End(Not run)
# Compute degeneration for sequences with differing number of ambiguous bases
seq <- strsplit(c("ctggaattacggtacc", "taggaaccggrtaagc", "rtaaasrygtar"), split = "")
degen <- score_degen(seq)

# Score the primers
data(Ippolito)
primer.scores <- score_primers(primer.df, settings)

matdoering/openPrimeR documentation built on July 4, 2025, 3:59 a.m.