Aids the design of pairs of primers for amplifying a unique “signature” from each group of sequences. Signatures are distinct PCR products that can be differentiated by their length, melt temperature, or sequence.
DesignSignatures(dbFile,
                 tblName = "Seqs",
                 identifier = "",
                 focusID = NA,
                 type = "melt",
                 resolution = 0.5,
                 levels = 10,
                 enzymes = NULL,
                 minLength = 17,
                 maxLength = 26,
                 maxPermutations = 4,
                 annealingTemp = 64,
                 P = 4e-07,
                 monovalent = 0.07,
                 divalent = 0.003,
                 dNTPs = 8e-04,
                 minEfficiency = 0.8,
                 ampEfficiency = 0.5,
                 numPrimerSets = 100,
                 minProductSize = 70,
                 maxProductSize = 400,
                 kmerSize = 8,
                 searchPrimers = 500,
                 maxDictionary = 20000,
                 primerDimer = 1e-07,
                 pNorm = 1,
                 taqEfficiency = TRUE,
                 processors = 1,
                 verbose = TRUE)
dbFile
A SQLite connection object or a character string specifying the path to the database file.
tblName
Character string specifying the table where the DNA sequences are located.
identifier
Optional character string used to narrow the search results to those matching a specific identifier. Determines the target group(s) for which primers will be designed. If "" then all identifiers are selected.
focusID
Optional character string specifying which of the identifier groups is used for the initial primer design.
type
Character string indicating the type of signature being used to differentiate the PCR products from each group. This should be (an abbreviation of) one of "melt", "length", or "sequence".
resolution
Numeric specifying the “resolution” of the experiment, or a vector giving the boundaries of bins. (See details section below.)
levels
Numeric giving the number of “levels” that can be distinguished in each bin. (See details section below.)
enzymes
Named character vector providing the cut sites of one or more restriction enzymes. Cut sites must be delineated in the same format as RESTRICTION_ENZYMES.
minLength
Integer providing the minimum length of primers to consider in the design.
maxLength
Integer providing the maximum length of primers to consider in the design.
maxPermutations
Integer providing the maximum number of permutations allowed in a forward or reverse primer to attain greater coverage of sequences.
annealingTemp
Numeric indicating the desired annealing temperature that will be used in the PCR experiment.
P
Numeric giving the molar concentration of primers in the reaction.
monovalent
The molar concentration of monovalent ([Na] and [K]) ions in solution that will be used to determine a sodium equivalent concentration.
divalent
The molar concentration of divalent ([Mg]) ions in solution that will be used to determine a sodium equivalent concentration.
dNTPs
Numeric giving the molar concentration of free nucleotides added to the solution that will be used to determine a sodium equivalent concentration.
minEfficiency
Numeric giving the minimum efficiency of hybridization desired for the primer set.
ampEfficiency
Numeric giving the minimum efficiency required for theoretical amplification of the primers. Note that
numPrimerSets
Integer giving the optimal number of primer sets (forward and reverse primer sets) to design.
minProductSize
Integer giving the minimum number of nucleotides desired in the PCR product.
maxProductSize
Integer giving the maximum number of nucleotides desired in the PCR product.
kmerSize
Integer giving the size of k-mers to use in the preliminary search for potential primers.
searchPrimers
Numeric specifying the number of forward and reverse primers to use in searching for potential PCR products. A lower value will result in a faster search, but potentially neglect some useful primers.
maxDictionary
Numeric giving the maximum number of primers to search for simultaneously in any given step.
primerDimer
Numeric giving the maximum amplification efficiency of potential primer-dimer products.
pNorm
Numeric specifying the power (p > 0) used in calculating the Lp-norm when scoring primer pairs. By default (p = 1), the score is equivalent to the average difference between pairwise signatures. When p < 1, many small differences will be preferred over fewer large differences, and vice versa when p > 1. This enables prioritizing primer pairs that will yield a greater number of unique signatures (p < 1), or fewer distinct, but more dissimilar, signatures (p > 1).
taqEfficiency
Logical determining whether to make use of elongation efficiency to increase predictive accuracy for Taq DNA Polymerase amplifying primers with mismatches near the 3' terminus. Note that this should be set to FALSE if using a high-fidelity polymerase with 3' to 5' exonuclease activity.
processors
The number of processors to use, or
verbose
Logical indicating whether to display progress.
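The effect of pNorm can be illustrated with a small base R sketch. This is not the package's internal scoring code: lp_score is a hypothetical helper that assumes the pairwise signature differences are already scaled to [0, 1] and combines them as a power mean, which reduces to the plain average at p = 1. The two difference vectors are made up for illustration.

```r
# Hypothetical helper (not part of DECIPHER): power mean of pairwise differences.
lp_score <- function(d, p = 1) {
  stopifnot(p > 0, all(d >= 0))
  mean(d^p)^(1/p)
}

# Two made-up primer pairs with the same average pairwise difference (0.25):
many_small <- c(0.25, 0.25, 0.25, 0.25)  # every pair of groups differs a little
few_large  <- c(1, 0, 0, 0)              # one pair of groups differs a lot

# Score both candidates under p < 1, p = 1, and p > 1.
scores <- sapply(c(0.5, 1, 2), function(p) {
  c(many_small = lp_score(many_small, p), few_large = lp_score(few_large, p))
})
colnames(scores) <- c("p=0.5", "p=1", "p=2")
print(scores)
```

Under these assumptions the two candidates tie at p = 1, the candidate with many small differences scores higher at p = 0.5, and the candidate with one large difference scores higher at p = 2.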
Signatures are group-specific PCR products that can be differentiated by either their melt temperature profile, length, or sequence. DesignSignatures assists in finding the optimal pair of forward and reverse primers for obtaining a distinguishable signature from each group of sequences. Groups are delineated by their unique identifier in the database. The algorithm works by progressively narrowing the search for optimal primers: (1) the most frequent k-mers are found; (2) these are used to design primers initially matching the focusID group; (3) the most common forward and reverse primers are selected based on all of the groups, and ambiguity is added up to maxPermutations; (4) a final search is performed to find the optimal forward and reverse primer. Pairs of primers are scored by the distance between the signatures generated for each group, which depends on the type of experiment.
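Step (1) of the search can be pictured with a short base R sketch. It is only an illustration of counting frequent k-mers, not the routine DesignSignatures uses internally; the helper kmers and the toy sequences are assumptions made for this example.

```r
# Hypothetical helper (not part of DECIPHER): all overlapping k-mers of one sequence.
kmers <- function(s, k) {
  starts <- seq_len(max(0, nchar(s) - k + 1))
  substring(s, starts, starts + k - 1)
}

# Made-up toy sequences standing in for the sequences of one group.
seqs <- c("ACGTACGTGG", "TTACGTACGA", "ACGTACCCTT")

# Count the number of sequences containing each 4-mer, then rank by frequency.
counts <- table(unlist(lapply(seqs, function(s) unique(kmers(s, 4)))))
top <- sort(counts, decreasing = TRUE)
print(head(top, 3))
```

The k-mers shared by the most sequences are the natural starting points for primers that cover a whole group, which is the role kmerSize plays in the preliminary search.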
The arguments resolution and levels control the theoretical resolving power of the experiment. The signature for a group is discretized or grouped into “bins” each with a certain magnitude of the signal. Here resolution determines the separation between distinguishable “bins”, and levels controls the range of values in each bin. A high-accuracy experiment would have many bins and/or many levels. While levels is interpreted similarly for every type of experiment, resolution is treated differently depending on type. If type is "melt", then resolution can be either a vector of different melt temperatures, or a single number giving the change in temperatures that can be differentiated. A high-resolution melt (HRM) assay would typically have a resolution between 0.25 and 1 degree Celsius. If type is "length" then resolution is either the number of bins between the minProductSize and maxProductSize, or the bin boundaries. For example, resolution can be lower (wider bins) at long lengths, and higher (narrower bins) at shorter lengths. If type is "sequence" then resolution sets the k-mer size used in differentiating amplicons. Oftentimes, 4 to 6-mers are used for the classification of amplicons.
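The binning idea for type = "length" can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: length_signature is a hypothetical helper, the product lengths are made up, and levels = 2 is treated as presence/absence within each bin.

```r
# Bin boundaries, as in a vector-valued 'resolution' for type = "length":
# narrow bins at short lengths, wider bins at long lengths.
boundaries <- c(seq(200, 700, 3), seq(705, 1000, 5), seq(1010, 1400, 10))

# Hypothetical helper (not part of DECIPHER): presence/absence signature,
# marking which bins contain at least one product length.
length_signature <- function(product_lengths, boundaries) {
  bins <- findInterval(product_lengths, boundaries)
  sig <- logical(length(boundaries) - 1)
  sig[bins[bins >= 1 & bins < length(boundaries)]] <- TRUE
  sig
}

# Made-up product lengths for two groups.
group_a <- length_signature(c(249, 250, 900), boundaries)
group_b <- length_signature(c(250, 1200), boundaries)

# 249 and 250 fall in the same 3 bp bin, so they occupy a single bin.
occupied_a <- sum(group_a)
# Number of bins in which the two signatures differ.
differing <- sum(group_a != group_b)
print(c(occupied_a = occupied_a, differing = differing))
```

Two products that land in the same bin contribute nothing to the distance between groups, which is why narrower bins (higher resolution) can separate groups that wider bins cannot.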
The signatures can be diversified by using a restriction enzyme to digest the PCR products when type is "melt" or "length". If enzymes are supplied then an additional search is made to find the best enzyme to use with each pair of primers. In this case, the output includes all of the primer pairs, as well as any enzymes that will digest the PCR products of that primer pair. The output is re-scored to rank the top primer pair and enzyme combination. Note that enzymes is inapplicable when type is "sequence" because restriction enzymes do not alter the sequence of the DNA. Also, it is recommended that only a subset of the available RESTRICTION_ENZYMES be used as input enzymes in order to accelerate the search for the best enzyme.
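How a digest changes a length signature can be sketched in base R. The helper fragment_lengths is hypothetical and deliberately simplified: it assumes a cut site written with "/" marking the cut position (for example "G/AATTC"), considers only the top strand, and uses a made-up product. For real analyses, DigestDNA is the function listed for this purpose.

```r
# Hypothetical helper (not part of DECIPHER): top-strand fragment lengths after
# cutting at every occurrence of a site written like "G/AATTC" ("/" = cut position).
fragment_lengths <- function(product, site) {
  offset <- regexpr("/", site, fixed = TRUE) - 1     # bases of the site left of the cut
  motif  <- sub("/", "", site, fixed = TRUE)         # recognition sequence without "/"
  hits   <- gregexpr(motif, product, fixed = TRUE)[[1]]
  if (hits[1] == -1) return(nchar(product))          # no site: product stays intact
  cuts <- as.integer(hits) + offset - 1              # last base of each left fragment
  diff(c(0, cuts, nchar(product)))
}

# Made-up 30 bp product containing a single GAATTC site.
product <- "ACGTACGTACGAATTCTTGGCCAATTGGCC"
cut_frags   <- fragment_lengths(product, "G/AATTC")  # site present
uncut_frags <- fragment_lengths(product, "G/TCGAC")  # site absent
print(cut_frags)
print(uncut_frags)
```

A product that is cut yields several shorter fragments in place of one long one, so two groups with products of equal length can still be told apart when only one of them contains the site.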
A data.frame with the top-scoring pairs of forward and reverse primers, their score, the total number of PCR products, and associated columns for the restriction enzyme (if enzymes is not NULL).
Erik Wright eswright@pitt.edu
Wright, E.S. & Vetsigian, K.H. (2016) "DesignSignatures: a tool for designing primers that yields amplicons with distinct signatures." Bioinformatics, doi:10.1093/bioinformatics/btw047.
AmplifyDNA, CalculateEfficiencyPCR, DesignPrimers, DigestDNA, Disambiguate, MeltDNA, RESTRICTION_ENZYMES
library(DECIPHER)

# below are suggested inputs for different types of experiments
db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")

## Not run:
# High Resolution Melt (HRM) assay:
primers <- DesignSignatures(db,
                            resolution=seq(75, 100, 0.25), # degrees Celsius
                            minProductSize=55, # base pairs
                            maxProductSize=400)

# Primers for next-generation sequencing:
primers <- DesignSignatures(db,
                            type="sequence",
                            minProductSize=300, # base pairs
                            maxProductSize=700,
                            resolution=5, # 5-mers
                            levels=5)

# Primers for community fingerprinting:
primers <- DesignSignatures(db,
                            type="length",
                            levels=2, # presence/absence
                            minProductSize=200, # base pairs
                            maxProductSize=1400,
                            resolution=c(seq(200, 700, 3),
                                         seq(705, 1000, 5),
                                         seq(1010, 1400, 10)))

# Primers for restriction fragment length polymorphism (RFLP):
data(RESTRICTION_ENZYMES)
myEnzymes <- RESTRICTION_ENZYMES[c("EcoRI", "HinfI", "SalI")]
primers <- DesignSignatures(db,
                            type="length",
                            levels=2, # presence/absence
                            minProductSize=200, # base pairs
                            maxProductSize=600,
                            resolution=c(seq(50, 100, 3),
                                         seq(105, 200, 5),
                                         seq(210, 600, 10)),
                            enzymes=myEnzymes)
## End(Not run)