Calculate the observed to expected frequency of all codon pairs for a given set of protein coding gene sequences.
sequences | Input can be the location of a fasta file, or a character string vector.
dnfControl | If TRUE, III-I dinucleotide bias is factored out.
transTable | Translation table to use for identifying codons. See
name | Name the table if save is TRUE.
save | TRUE or FALSE to save the CPB reference table as a comma delimited csv file.
location | File path to save the csv file.
silent | If TRUE the progress bar is suppressed.
With a standard translation table there are 3,721 coding codon pairs (61 x 61 sense codons). The codon pair score (CPS) of each individual codon pair is given by:
ln((codon pair[ab] x amino acid[a] x amino acid[b]) / (codon[a] x codon[b] x amino acid pair[ab]))
Each term is measured as a relative frequency, that is, its count as a fraction of the total. Tandem codon positions are marked a and b. A codon pair consists of 6 nucleotides, and pairs are counted in steps of three nucleotides along the sequence, so consecutive pairs share one codon.
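The counting scheme and the formula above can be illustrated with a minimal base R sketch. It does not call the package: the toy sequence, the four-codon mapping `aa_of`, and the helpers `rel_freq` and `cps` are hypothetical names introduced for illustration, and how CPBtable normalises its counts internally is an assumption here.

```r
# Toy in-frame sequence and a hypothetical mini translation table
# (only the codons used here; not the package's standardTranslation).
cds <- "GCTGAAGCTGAAGCCGAAGCTGAG"
aa_of <- c(GCT = "A", GCC = "A", GAA = "E", GAG = "E")

# Split into codons, then form tandem pairs (a, b) one codon apart.
starts <- seq(1, nchar(cds) - 2, by = 3)
codons <- substring(cds, starts, starts + 2)
a <- head(codons, -1)
b <- tail(codons, -1)

# Relative frequency of each distinct value in x, as a named numeric vector.
rel_freq <- function(x) {
  tab <- table(x)
  setNames(as.numeric(tab) / length(x), names(tab))
}

f_codon   <- rel_freq(codons)
f_aa      <- rel_freq(unname(aa_of[codons]))
f_pair    <- rel_freq(paste0(a, b))
f_aa_pair <- rel_freq(paste0(aa_of[a], aa_of[b]))

# CPS for one codon pair, following the formula above.
cps <- function(ca, cb) {
  num <- f_pair[[paste0(ca, cb)]] * f_aa[[aa_of[[ca]]]] * f_aa[[aa_of[[cb]]]]
  den <- f_codon[[ca]] * f_codon[[cb]] *
    f_aa_pair[[paste0(aa_of[[ca]], aa_of[[cb]])]]
  log(num / den)
}

cps("GCT", "GAA")
```

A negative score means the pair occurs less often than expected from its codon and amino acid frequencies; a positive score means it occurs more often.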
Sequences containing nucleotides not found in the translation table, and sequences whose length is not divisible by three, are excluded. Codon pairs that contain a stop codon or a codon undefined in the translation table yield NA and are not included in the CPB calculation. By default, the standard translation table is used (see standardTranslation). All input sequences should be in-frame, protein-coding (CDS) sequences.
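To check in advance which sequences would pass the exclusion rules described above, a rough pre-filter can be sketched in base R. The helper name `keep_in_frame` is hypothetical, and it assumes that the accepted nucleotides are exactly A, C, G and T; the package's own checks may differ.

```r
# Hypothetical helper: keep only sequences whose length is a multiple
# of three and that contain only A, C, G, T (assumed alphabet).
keep_in_frame <- function(seqs) {
  seqs <- toupper(seqs)
  ok_length  <- nchar(seqs) %% 3 == 0
  ok_letters <- grepl("^[ACGT]+$", seqs)
  seqs[ok_length & ok_letters]
}

# The second sequence has 8 nucleotides and the third contains N,
# so only the first is kept.
keep_in_frame(c("ATGGCTTAA", "ATGGCTTA", "ATGNNNTAA"))
```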
A list with two elements is returned invisibly:
CPBtable | CPB reference table containing all coding codon pairs and their individual CPS calculated with the above formula. This format can be used as the CPB reference input to the CPSdesign functions.
complete.CPBtable | Larger CPBtable containing the frequencies of all components of the codon pair score calculation.
Codon pair bias, also called codon context, is typically regarded as species specific. For this reason, CPB reference tables have been pre-calculated for organisms with whole-genome CDS sequence data available. See listCPB for a list of pre-calculated CPB reference tables.
Coleman JR, et al. 2008 Virus attenuation by genome-scale changes in codon pair bias. Science 320(5884):1784–1787.
fastaLocation <- system.file('ccds.fasta', package = 'CPBias')
ccds <- importFasta(fastaLocation, sepSequences = TRUE)[[1]]
ccdsCPB <- CPBtable(ccds)
# First element in returned list is the CPB reference table
ccds.sample <- ccdsCPB[[1]]
# CPBtable will import sequences automatically if fasta location is given
ccdsCPB <- CPBtable(fastaLocation)[[1]]
head(ccdsCPB)
# Factor out dinucleotide bias between codons
dnCPB <- CPBtable(fastaLocation, dnfControl=TRUE)[[1]]
plot(ccdsCPB[,2], dnCPB[,2])