Shuffle codons to generate sequences with different average codon pair scores relative to two poorly correlated reference codon pair biases.
CPSdesign.dual(sequence, reference, referenceTwo, score, scoreTwo, start = 1,
  end = NULL, cycles = NULL, buffer = 0.5, bind = c(1, 1, 1, 1),
  transTable = standardTranslation, restrictSeqs = NULL,
  complementary = FALSE, windowSize = NULL, save = FALSE, name = NULL,
  location = NULL, draw = TRUE, silent = FALSE)
sequence |
Sequences can be input directly as a character string or as the file path to a fasta file. All sequences must be in the correct reading frame. Stop codons and codons not defined in the translation table are not allowed and will generate an error. |
reference |
CPB reference table (see |
referenceTwo |
A second reference table for recoding relative to two CPBs. |
score |
Ideal score relative to the first reference. Input can be numeric, ‘min’, or ‘max’. |
scoreTwo |
Ideal score relative to the second reference. Format same as |
start |
Nucleotide position in the sequence. |
end |
Nucleotide position in the sequence. If NULL, the last in-frame nucleotide position is used. |
cycles |
Optional input designating the number of recoding cycles. If empty, a minimal number of cycles is determined. Increasing the number of cycles may result in scores closer to the ideal value. |
buffer |
Controls the probability of alternating recoding preference toward the other reference during a single round of codon shuffling. Accepts numeric input. |
bind |
Controls the permissivity of scores greater than or less than the ideal score for each reference. Input is a four element numeric vector interpreted relative to each other. Position in the vector designates which direction and which reference. The first position controls scores less than the ideal first score, the second position controls scores greater than the ideal first score, and the third and fourth numbers control scores less than, and greater than the desired second score. Default values represent no bias toward either reference and direction. |
transTable |
Alternative translation tables can be used. |
restrictSeqs |
A string of comma separated sequences to remove or avoid in the input sequence while recoding. Search is performed 5' to 3' on the given strand. R regular expressions are allowed. |
complementary |
If TRUE search for restricted sequences is also performed on the complementary strand 5' to 3'. |
windowSize |
CPS line plots are smoothed by locally weighted polynomial regression where |
save |
Save recoded sequence in fasta file. |
name |
Output sequence name and fasta file name. |
location |
Save location of output sequence. |
draw |
If TRUE a line plot showing the local CPS along the length of the sequence is output to the graphics device during recoding. |
silent |
If TRUE output to the console is suppressed. |
An input sequence can be differentially recoded between two poorly correlated CPB references. A correlation test on two CPB reference tables can be computed directly with CPBcorr. CPSdesign.dual will attempt to create a recoded sequence characterized by two ideal codon pair scores relative to two different CPBs. See listCPB for a list of available CPB reference tables.
For best results with dual reference recoding, avoid the 'max' and 'min' inputs for the ideal score. It is better to first determine the range of possible scores relative to each reference alone, and then use specific scores when recoding for two references. Use the bind argument if the 'max' or 'min' possible score is desired. The bind argument controls the preference and direction of recoding for either reference. It takes a four element numeric vector: the first position controls the permissivity of scores less than the ideal score for the first reference, the second position controls scores greater than ideal for the first reference, and the third and fourth positions control scores less than and greater than ideal for the second reference, respectively. Preference between directions and references is decided relative to each other, so any four identical numbers result in no biased preference. Increasing one value relative to the others allows recoding to explore more sequences in the direction and reference designated by that position in the bind vector.
Output scores can be further optimized by increasing the buffer value if the ideal scores for the two references are dramatically different or there is very little correlation between the reference CPBs. Increasing the buffer increases the probability of alternating between references during a single permutation, whereas normally each round of codon shuffling is performed relative to one reference at a time.
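The statement that bind values are interpreted relative to each other can be illustrated with a small base R sketch. The helper bindShare is hypothetical and dividing by the sum is only an assumption chosen for illustration; it is not taken from the CPBias source and may not match how the package weights the values internally.

```r
# Hypothetical helper (not part of CPBias): express each bind entry as a
# share of the total, to show what "relative to each other" means.
bindShare <- function(bind) {
  stopifnot(is.numeric(bind), length(bind) == 4, all(bind > 0))
  names(bind) <- c("ref1_below", "ref1_above", "ref2_below", "ref2_above")
  bind / sum(bind)
}

# Any four identical numbers give the same shares, i.e. no preference.
bindShare(c(1, 1, 1, 1))
bindShare(c(5, 5, 5, 5))

# Raising one position relative to the others shifts the share toward it;
# here scores above the ideal for reference 1 dominate.
bindShare(c(1, 1000, 1, 0.001))
```

Under this reading, only the ratios between the four entries matter, which is why c(1, 1, 1, 1) and c(5, 5, 5, 5) behave the same.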
Recoded sequences can be saved as a fasta file with the save argument. Additional information about the recoded sequence is returned as an invisible list.
firstoldCPS |
Average codon pair score of the input sequence relative to reference 1. |
firstnewCPS |
Average codon pair score of recoded sequence relative to reference 1. |
secondoldCPS |
Average codon pair score of the input sequence relative to reference 2. |
secondnewCPS |
Average codon pair score of recoded sequence relative to reference 2. |
codonchanges |
If codons were changed, the function will return an error. |
mutations |
Number of point mutations generated by recoding. |
oldCPSarray |
Vector of individual codon pair scores for the input sequence. |
newCPSarray |
Vector of individual codon pair scores for the recoded sequence. |
returnSeq |
The recoded sequence. |
If restricted sequences are given, an additional item is returned:
restrSeqs |
The number of matches in the recoded sequence to the restricted sequences, value will include the complementary sequence is |
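Because the list is returned invisibly, it is not auto-printed unless it is assigned or printed explicitly. The base R sketch below shows that behaviour with a hypothetical stand-in function, mockRecode, which is not CPSdesign.dual; it reuses a few of the element names documented above and holds NA placeholders rather than real scores.

```r
# Hypothetical stand-in for a recoding call; it is NOT CPSdesign.dual.
# Values are NA placeholders so that no scores are invented.
mockRecode <- function() {
  result <- list(
    firstoldCPS  = NA_real_,
    firstnewCPS  = NA_real_,
    secondoldCPS = NA_real_,
    secondnewCPS = NA_real_,
    returnSeq    = NA_character_
  )
  invisible(result)
}

# Called on its own at top level, nothing is auto-printed.
mockRecode()

# Assigning the result makes the elements available by name.
res <- mockRecode()
names(res)
res$firstnewCPS
```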
Both single and dual reference recoding offer the option of removing certain types of sequence elements by restricting which codons can be paired together. For example, restriction enzyme recognition sequences can be removed or prevented from appearing in the recoded sequence. These restricted sequences can be passed to the restrictSeqs argument as a character string with sequences separated by commas. A comprehensive list of restriction enzyme recognition sequences expressed as R regular expressions is provided in this package; see REseqs. By default, restricted sequences are searched only on the input strand; the complementary argument enables searching on the reverse complement strand as well. Depending on the type and number of restricted sequences, this functionality can dramatically slow down recoding and limit the achievable CPS.
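To make the restrictSeqs format and the effect of the complementary strand concrete, here is a minimal base R sketch. It only illustrates what a 5' to 3' search on each strand means; the helpers revComp and countHits and the toy sequence are assumptions for illustration and are not the package's own search code.

```r
# Comma-separated restricted sequences, in the format described above.
# GAATTC (EcoRI) is palindromic; GGTCTC (BsaI) is not.
restrictSeqs <- "GAATTC,GGTCTC"
patterns <- strsplit(restrictSeqs, ",", fixed = TRUE)[[1]]

# Hypothetical helper: reverse complement of a DNA string.
revComp <- function(x) {
  comp <- chartr("ACGT", "TGCA", x)
  paste(rev(strsplit(comp, "", fixed = TRUE)[[1]]), collapse = "")
}

# Hypothetical helper: count non-overlapping regex matches in a string.
countHits <- function(pattern, x) {
  m <- gregexpr(pattern, x, perl = TRUE)[[1]]
  sum(m > 0)
}

# Toy in-frame sequence (an assumption, not package data).
toySeq <- "ATGGAATTCAAAGAGACCTTT"

# Matches on the given strand only.
forward <- sapply(patterns, countHits, x = toySeq)

# Matches on the reverse complement strand, read 5' to 3'.
reverse <- sapply(patterns, countHits, x = revComp(toySeq))

forward
reverse
```

In this toy case the BsaI site is found only on the reverse complement strand, which is the kind of match that a search restricted to the input strand would miss.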
library(CPBias)
fastaLocation <- system.file('tbevns5.fasta', package = 'CPBias')
tbev <- importFasta(fastaLocation)[[1]]
# A dual CPS recoding of TBE virus relative to the CPB of two of its natural hosts,
# Homo.sapiens and Aedes aegypti. Design strategy is to make Homo.sapiens CPS as high
# as possible and less than WT CPS in Aedes.aegypti.
# If correlation between codon pair biases is too high, dual differential recoding may not
# be possible.
CPBcorr(Homo.sapiens, Aedes.aegypti)
# First estimate the possible range of scores relative to both hosts
HumMax <- CPSdesign.single(tbev, Homo.sapiens, 'max', silent=TRUE, draw=FALSE)[[2]]
AedMin <- CPSdesign.single(tbev, Aedes.aegypti, 'min', silent=TRUE, draw=FALSE)[[2]]
# Bind is set to prefer greater CPS in Homo.sapiens, while greater CPS in
# Aedes.aegypti is strongly prohibited. There is room to play with these settings.
tbevDual <- CPSdesign.dual(tbev, Homo.sapiens, Aedes.aegypti, .25, -.02,
                           bind = c(1, 1000, 1, .001))
tbevDual