designOligos: Design primers and probes

Description

designOligos() designs oligos (primers and probes) from a consensus profile.

Usage

designOligos(
  x,
  maxGapFrequency = 0.01,
  lengthPrimer = c(18, 22),
  maxDegeneracyPrimer = 4,
  gcClampPrimer = TRUE,
  avoidThreeEndRunsPrimer = TRUE,
  gcPrimer = c(0.4, 0.65),
  tmPrimer = c(50, 65),
  concPrimer = 500,
  designStrategyPrimer = "ambiguous",
  probe = TRUE,
  lengthProbe = c(18, 22),
  maxDegeneracyProbe = 4,
  avoidFiveEndGProbe = TRUE,
  gcProbe = c(0.4, 0.65),
  tmProbe = c(50, 70),
  concProbe = 250,
  concNa = 0.05
)

Arguments

x

An RprimerProfile object.

maxGapFrequency

Maximum allowed gap frequency at the primer and probe binding sites in the target alignment. A number [0, 1], defaults to 0.01.

lengthPrimer

Primer length range. A numeric vector [15, 30], defaults to c(18, 22).

maxDegeneracyPrimer

Maximum number of variants of each primer. A number [1, 256], defaults to 4.

gcClampPrimer

Whether primers must have a GC clamp. A GC clamp is defined as two to three G or C bases within the last five bases (3' end) of the primer. TRUE or FALSE, defaults to TRUE.

avoidThreeEndRunsPrimer

Whether primers with more than two consecutive occurrences of the same nucleotide at the terminal 3' end should be avoided. TRUE or FALSE, defaults to TRUE.

gcPrimer

GC-content range for primers. A numeric vector [0, 1], defaults to c(0.40, 0.65).

tmPrimer

Tm range for primers (in degrees Celsius). A numeric vector [30, 90], defaults to c(50, 65).

concPrimer

Primer concentration in nM, for tm calculation. A number [20, 2000], defaults to 500.

designStrategyPrimer

"ambiguous" or "mixed". Defaults to "ambiguous" (see details below).

probe

If probes should be designed. TRUE or FALSE, defaults to TRUE.

lengthProbe

Probe length range. A numeric vector [15, 40], defaults to c(18, 22).

maxDegeneracyProbe

Maximum number of variants of each probe. A number [1, 256], defaults to 4.

avoidFiveEndGProbe

If probes with G at the 5' end should be avoided. TRUE or FALSE, defaults to TRUE.

gcProbe

GC-content range for probes. A numeric vector [0, 1], defaults to c(0.40, 0.65).

tmProbe

Tm range for probes (in degrees Celsius). A numeric vector [30, 90], defaults to c(50, 70).

concProbe

Probe concentration in nM, for tm calculation. A number [20, 2000], defaults to 250.

concNa

Sodium ion (equivalent) concentration in the PCR (in M), used for calculation of tm and delta G. A number [0.01, 1], defaults to 0.05 (50 mM).
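The sequence-level constraints above (degeneracy, GC clamp and 3' end runs) can be illustrated with a minimal base R sketch. The helper names are hypothetical and are not part of rprimer, and the 3' end run check assumes that "more than two" means the last three bases are identical.

```r
# IUPAC nucleotide codes and the bases each one represents.
iupac <- list(
  A = "A", C = "C", G = "G", T = "T",
  R = c("A", "G"), Y = c("C", "T"), S = c("C", "G"), W = c("A", "T"),
  K = c("G", "T"), M = c("A", "C"),
  B = c("C", "G", "T"), D = c("A", "G", "T"),
  H = c("A", "C", "T"), V = c("A", "C", "G"),
  N = c("A", "C", "G", "T")
)

# Hypothetical helper: number of sequence variants encoded by an IUPAC oligo
# (the product of the number of bases behind each code).
countVariants <- function(oligo) {
  bases <- strsplit(toupper(oligo), "")[[1]]
  prod(lengths(iupac[bases]))
}

# Hypothetical helper: TRUE if two to three of the last five bases are G or C.
hasGcClamp <- function(oligo) {
  bases <- strsplit(toupper(oligo), "")[[1]]
  nGc <- sum(utils::tail(bases, 5) %in% c("G", "C"))
  nGc >= 2 && nGc <= 3
}

# Hypothetical helper: TRUE if the last three bases are identical
# (assumed reading of "more than two" of the same nucleotide at the 3' end).
hasThreeEndRun <- function(oligo) {
  grepl("(.)\\1{2}$", toupper(oligo), perl = TRUE)
}

countVariants("ACGTRY")      # R and Y each encode two bases
hasGcClamp("ATATATATGC")     # last five bases: TATGC
hasThreeEndRun("ACGTAAA")    # ends in AAA
```

An oligo would pass these checks if countVariants() is at most maxDegeneracyPrimer, hasGcClamp() is TRUE and hasThreeEndRun() is FALSE.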

Details

Valid oligos

For an oligo to be considered as valid, all sequence variants must fulfill all the specified design constraints.

Furthermore, oligos with at least one sequence variant containing more than four consecutive identical nucleotides (e.g. "AAAAA") and/or more than three consecutive repeats of the same dinucleotide (e.g. "TATATATA") are excluded from consideration.
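As a rough illustration of the repeat filter described above, the following base R sketch flags an oligo if any of its sequence variants contains such a repeat. The function name is hypothetical, and the regular expressions are one possible way to express the rule, not necessarily how the package implements it.

```r
# Hypothetical helper: TRUE if any variant has five or more identical
# consecutive bases, or four or more consecutive copies of a dinucleotide.
hasLongRepeat <- function(variants) {
  mono <- grepl("(.)\\1{4,}", variants, perl = TRUE)
  di <- grepl("(..)\\1{3,}", variants, perl = TRUE)
  any(mono | di)
}

hasLongRepeat(c("ACGTACGT", "GGAAAAACC"))  # second variant contains AAAAA
hasLongRepeat("CGTATATATAGC")              # contains TATATATA
hasLongRepeat("ACGTTGCAACGGT")             # no long repeats
```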

Calculation of tm and delta G

Melting temperatures are calculated for perfectly matching DNA duplexes using the nearest-neighbor method (SantaLucia and Hicks, 2004), by using the following equation:

\mjsdeqn{T_m = (\Delta H^o \cdot 1000) / (\Delta S^o + R \cdot \log [\mathrm{oligo}]) - 273.15}

where \mjseqn{\Delta H^o} is the change in enthalpy (in kcal/mol) and \mjseqn{\Delta S^o} is the change in entropy (in cal/K/mol) when an oligo and a perfectly matching target sequence go from random coil to duplex state. \mjseqn{R} is the gas constant (1.9872 cal/mol K).
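The tm equation can be evaluated directly in base R. The sketch below is an illustration only: the function name and the input values are hypothetical, delta H is assumed to be given in kcal/mol (as the factor 1000 suggests), the logarithm is assumed to be the natural logarithm, and the oligo concentration is assumed to be converted from nM to M before use.

```r
gasConstant <- 1.9872  # cal/(mol K)

# Hypothetical helper: melting temperature in degrees Celsius.
# deltaH in kcal/mol, deltaS in cal/(K mol), oligoConcNm in nM.
meltingTemp <- function(deltaH, deltaS, oligoConcNm) {
  oligoConcM <- oligoConcNm * 1e-9  # assumed conversion from nM to M
  (deltaH * 1000) / (deltaS + gasConstant * log(oligoConcM)) - 273.15
}

# Made-up thermodynamic values, for illustration only.
tm <- meltingTemp(deltaH = -150, deltaS = -400, oligoConcNm = 500)
round(tm, 1)
```

With these made-up inputs the result is roughly 76.6 degrees Celsius. In practice delta H and delta S would be summed from the nearest-neighbor table for the actual sequence.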

Delta G is calculated at 37 degrees Celsius, for the transition of an oligo and a perfectly matching target sequence from random coil to duplex state, using the following equation:

\mjsdeqn{\Delta G^o_T = (\Delta H^o \cdot 1000 - T \cdot \Delta S^o) / 1000}

For both tm and delta G, the following salt correction method is used for \mjseqn{\Delta S^o}, as described in SantaLucia and Hicks (2004):

\mjsdeqn{\Delta S^o [\mathrm{Na^+}] = \Delta S^o [\mathrm{1\ M\ NaCl}] + 0.368 \cdot N / 2 \cdot \log [\mathrm{Na^+}]}

where \mjseqn{N} is the total number of phosphates in the duplex, and [Na+] is the total concentration of monovalent cations.
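The delta G and salt correction equations can be sketched in base R as follows. The function names and input values are hypothetical, the logarithm is assumed to be the natural logarithm, T is taken as 310.15 K (37 degrees Celsius), and the number of phosphates is passed in as a parameter rather than derived from a sequence.

```r
# Hypothetical helper: delta G in kcal/mol at temperature tempK (in Kelvin).
# deltaH in kcal/mol, deltaS in cal/(K mol).
deltaGAtTemp <- function(deltaH, deltaS, tempK = 310.15) {
  (deltaH * 1000 - tempK * deltaS) / 1000
}

# Hypothetical helper: salt-corrected entropy in cal/(K mol).
# nPhosphates is the total number of phosphates in the duplex,
# concNa is the monovalent cation concentration in M.
saltCorrectEntropy <- function(deltaS, nPhosphates, concNa) {
  deltaS + 0.368 * (nPhosphates / 2) * log(concNa)
}

# Made-up values, for illustration only.
dsCorrected <- saltCorrectEntropy(deltaS = -400, nPhosphates = 38, concNa = 0.05)
dg <- deltaGAtTemp(deltaH = -150, deltaS = -400)
```

Because log(concNa) is negative below 1 M, the correction makes the entropy term more negative, which in turn lowers the computed tm and makes delta G less favorable.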

Nearest-neighbor table values for \mjseqn{\Delta S^o} and \mjseqn{\Delta H^o} are from SantaLucia and Hicks (2004), and can be retrieved by calling rprimer:::lookup$nn.

Primer design strategies

Primers can be generated by using one of the two following strategies:

  • The ambiguous strategy (default) generates primers from the IUPAC consensus sequence alone, which means that ambiguous bases can occur at any position in the primer.

  • The mixed strategy generates primers from both the majority and the IUPAC consensus sequence. These primers consist of a shorter degenerate part at the 3' end (approx. 1/3 of the primer, targeting a conserved region) and a longer consensus part at the 5' end (approx. 2/3 of the primer), which instead of having ambiguous bases contains the most frequently occurring nucleotide at each position. This strategy resembles the widely adopted Consensus-Degenerate Hybrid Oligonucleotide Primer (CODEHOP) principle (Rose et al., 1998), and aims to allow amplification of highly variable targets using primers with low degeneracy. The idea is that the degenerate 3' end part will bind specifically to the target sequence in the initial PCR cycles, and promote amplification despite possible mismatches at the 5' consensus part (since 5' end mismatches are generally less detrimental than 3' end mismatches). In this way, the generated products will match the 5' ends of all primers perfectly, which allows them to be efficiently amplified in later PCR cycles. To provide a sufficiently high tm despite mismatches, it is recommended to design relatively long primers (at least 25 bases) when using this strategy.

Probes are always designed using the ambiguous strategy.
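To make the mixed strategy more concrete, the following base R sketch assembles a forward primer from a majority consensus and an IUPAC consensus of the same region. The function name is hypothetical, and the split (the last third, rounded, taken from the IUPAC consensus) is an assumption based on the approximate proportions given above, not the package's exact rule.

```r
# Hypothetical helper: build a mixed forward primer from two consensus
# strings covering the same positions.
buildMixedPrimer <- function(majority, iupac) {
  stopifnot(nchar(majority) == nchar(iupac))
  n <- nchar(majority)
  nDegenerate <- round(n / 3)  # assumed length of the degenerate 3' part
  paste0(
    substr(majority, 1, n - nDegenerate),   # 5' consensus part
    substr(iupac, n - nDegenerate + 1, n)   # 3' degenerate part
  )
}

majority <- "ACGTACGTACGTACGTAC"
iupac    <- "ACGTRCGTACGTAYGTRC"
mixed <- buildMixedPrimer(majority, iupac)
mixed
```

In this example the ambiguous base R at position 5 is replaced by the majority base A, while the ambiguous bases in the last six positions are kept. A reverse primer would be built the same way on the complementary strand.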

Scoring system for oligos

All valid oligos are scored based on their identity, coverage and average GC content. The scoring system is presented below.

Identity and coverage

Value range     Score
(0.99, 1]       0
(0.95, 0.99]    1
(0.90, 0.95]    2
≤ 0.90          3

Average GC-content

This score is based on how much the average GC-content deviates from 0.5 (in absolute value).

Value range     Score
[0, 0.05)       0
[0.05, 0.1)     1
[0.1, 0.2)      2
≥ 0.2           3

These scores are summed. Each individual score has a weight of 1, so the lowest and best possible score for an oligo is 0, and the highest and worst possible score is 9.
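The scoring tables can be expressed compactly in base R with findInterval(). This is a sketch of the rules as stated here, with hypothetical function names, and not the package's internal code.

```r
# Hypothetical helper: score for identity or coverage.
# Intervals are (0.99, 1] -> 0, (0.95, 0.99] -> 1, (0.90, 0.95] -> 2, <= 0.90 -> 3.
scoreIdentityOrCoverage <- function(x) {
  3 - findInterval(x, c(0.90, 0.95, 0.99), left.open = TRUE)
}

# Hypothetical helper: score for the deviation of mean GC-content from 0.5.
# Intervals are [0, 0.05) -> 0, [0.05, 0.1) -> 1, [0.1, 0.2) -> 2, >= 0.2 -> 3.
scoreGc <- function(gcMean) {
  findInterval(abs(gcMean - 0.5), c(0.05, 0.1, 0.2))
}

# Total score: unweighted sum of the three individual scores.
scoreOligo <- function(identity, coverage, gcMean) {
  scoreIdentityOrCoverage(identity) +
    scoreIdentityOrCoverage(coverage) +
    scoreGc(gcMean)
}

scoreOligo(identity = 0.97, coverage = 1, gcMean = 0.62)  # 1 + 0 + 2
```

Note that GC-content values lying exactly on an interval boundary can be affected by floating-point rounding in a sketch like this.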

Value

An RprimerOligo object, containing the following information:

type

Whether the oligo is a primer or probe.

fwd

TRUE if the oligo is valid in forward direction, FALSE otherwise.

rev

TRUE if the oligo is valid in reverse direction, FALSE otherwise.

start

Start position of the oligo.

end

End position of the oligo.

length

Oligo length.

iupacSequence

Oligo sequence in IUPAC format (i.e. with ambiguous bases).

iupacSequenceRc

The reverse complement of the iupacSequence.

identity

For ambiguous oligos: average identity of the oligo. For mixed oligos: average identity of the 5' (consensus) part of the oligo. The value can range from 0 to 1.

coverage

For ambiguous oligos: average coverage of the oligo. For mixed oligos: average coverage of the 3' (degenerate) part of the oligo. The value can range from 0 to 1.

degeneracy

Number of sequence variants of the oligo.

gcContentMean

Mean GC-content of all sequence variants of the oligo.

gcContentRange

Range in GC-content of all sequence variants of the oligo.

tmMean

Mean tm of all sequence variants of the oligo (in degrees Celsius).

tmRange

Range in tm of all sequence variants of the oligo (in degrees Celsius).

deltaGMean

Mean delta G of all sequence variants of the oligo (in kcal/mol).

deltaGRange

Range in delta G of all sequence variants of the oligo (in kcal/mol).

sequence

All sequence variants of the oligo.

sequenceRc

Reverse complements of all sequence variants.

gcContent

GC-content of all sequence variants.

tm

Tm of all sequence variants (in degrees Celsius).

deltaG

Delta G of all sequence variants (in kcal/mol).

method

Design method used to generate the oligo: "ambiguous", "mixedFwd" or "mixedRev".

score

Oligo score, the lower the better.

roiStart

First position of the input RprimerProfile object (roi = region of interest).

roiEnd

Last position of the input RprimerProfile object.
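Several of the values listed above (degeneracy, the per-variant sequences and GC-content, and their mean and range) derive from expanding the IUPAC sequence into its sequence variants. The following base R sketch shows one way this relationship could be reproduced; the helper names are hypothetical and not part of rprimer.

```r
# IUPAC nucleotide codes and the bases each one represents.
iupac <- list(
  A = "A", C = "C", G = "G", T = "T",
  R = c("A", "G"), Y = c("C", "T"), S = c("C", "G"), W = c("A", "T"),
  K = c("G", "T"), M = c("A", "C"),
  B = c("C", "G", "T"), D = c("A", "G", "T"),
  H = c("A", "C", "T"), V = c("A", "C", "G"),
  N = c("A", "C", "G", "T")
)

# Hypothetical helper: all sequence variants of an IUPAC oligo.
expandVariants <- function(oligo) {
  bases <- strsplit(toupper(oligo), "")[[1]]
  combos <- expand.grid(unname(iupac[bases]), stringsAsFactors = FALSE)
  unname(apply(combos, 1, paste, collapse = ""))
}

# Hypothetical helper: GC-content (proportion of G and C) per sequence.
gcContent <- function(sequences) {
  vapply(
    strsplit(sequences, ""),
    function(b) mean(b %in% c("G", "C")),
    numeric(1)
  )
}

variants <- expandVariants("ACR")  # R encodes A or G
gc <- gcContent(variants)
length(variants)  # degeneracy
mean(gc)          # corresponds to gcContentMean
range(gc)         # range in GC-content across variants
```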

An error is raised if no oligos are found. In that case, consider re-running the design process with relaxed constraints.
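Since a failed design signals an error, a retry with relaxed constraints can be wrapped in tryCatch(). The helper below is a hypothetical sketch that takes the design function as an argument, so it does not depend on rprimer itself. The relaxed settings shown in the usage note are arbitrary examples, not recommendations.

```r
# Hypothetical helper: call `design` with strict arguments first, and fall
# back to relaxed arguments if the first call signals an error.
tryDesign <- function(design, strict, relaxed) {
  tryCatch(
    do.call(design, strict),
    error = function(e) {
      message(
        "Design failed with strict settings (", conditionMessage(e),
        "); retrying with relaxed settings."
      )
      do.call(design, relaxed)
    }
  )
}
```

With rprimer loaded and x being an RprimerProfile object, this could be called as tryDesign(designOligos, list(x = x), list(x = x, maxDegeneracyPrimer = 8, tmPrimer = c(45, 70))). If the relaxed call also fails, its error is passed on to the caller.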

References

Rose, T. M., Schultz, E. R., Henikoff, J. G., Pietrokovski, S., McCallum, C. M., & Henikoff, S. (1998). Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly related sequences. Nucleic Acids Research, 26(7), 1628-1635.

SantaLucia Jr, J., & Hicks, D. (2004). The thermodynamics of DNA structural motifs. Annu. Rev. Biophys. Biomol. Struct., 33, 415-440.

Examples

data("exampleRprimerProfile")
x <- exampleRprimerProfile

## Design primers and probes with default values
designOligos(x)

sofpn/rprimer documentation built on July 2, 2023, 7:15 a.m.