cai: Codon Adaptation Index


cai R Documentation

Codon Adaptation Index


Description

The Codon Adaptation Index (Sharp and Li 1987) is the most popular index of gene expressivity, with about 1000 citations 20 years after its publication. Its values range from 0 (low) to 1 (high). The implementation here is intended to work exactly as in the program codonW, written by John Peden during his PhD thesis under the supervision of P.M. Sharp.


Usage

  cai(seq, w, numcode = 1, zero.threshold = 0.0001, zero.to = 0.01)



Arguments

seq
a coding sequence as a vector of single characters

w
a vector for the relative adaptiveness of each codon

numcode
the genetic code number as in translate

zero.threshold
a value in w below this threshold is considered as zero

zero.to
a value considered as zero in w is forced to this value. The default is from Bulmer (1988).
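The w argument holds one relative adaptiveness value per codon. As a rough illustration of where such values come from, the sketch below derives them from codon counts in base R. The counts are made-up numbers, and relative_adaptiveness() is a hypothetical helper written for this example; it is not part of seqinr and only covers two amino acids.

```r
# Hypothetical codon counts from a reference set of highly expressed genes
# (made-up numbers, for illustration only).
counts <- c(gct = 10, gcc = 40, gca = 5, gcg = 0,   # alanine codons
            aaa = 30, aag = 60)                     # lysine codons

# Amino acid encoded by each codon (standard genetic code).
aa <- c(gct = "Ala", gcc = "Ala", gca = "Ala", gcg = "Ala",
        aaa = "Lys", aag = "Lys")

# Relative adaptiveness: each count divided by the largest count
# among the codons coding for the same amino acid.
relative_adaptiveness <- function(counts, aa) {
  max_per_aa <- ave(counts, aa, FUN = max)  # group-wise maximum, recycled per codon
  w <- counts / max_per_aa
  names(w) <- names(counts)
  w
}

w_toy <- relative_adaptiveness(counts, aa)
print(w_toy)
```

In this toy table the most abundant codon for each amino acid gets a value of 1, and a codon that never occurs gets 0, which is the case that the zero threshold and its replacement value are meant to handle.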


Details

Adapted from the documentation of the CAI function in the program codonW written by John Peden: CAI is a measurement of the relative adaptiveness of the codon usage of a gene towards the codon usage of highly expressed genes. The relative adaptiveness (w) of each codon is the ratio of the usage of that codon to that of the most abundant codon for the same amino acid. The CAI index is defined as the geometric mean of these relative adaptiveness values. Non-synonymous codons and termination codons (genetic code dependent) are excluded. To aid computation, the CAI is calculated using a natural log summation. To prevent a codon from having a relative adaptiveness value of zero, which could result in a CAI of zero, codons with a fitness of zero (< 0.0001) are adjusted to 0.01.
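The calculation described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the seqinr or codonW implementation: cai_sketch() is a hypothetical helper, the w values are toy numbers, and excluded codons are handled simply by leaving them out of w.

```r
# cai_sketch() is a hypothetical helper for illustration only.
# codons: character vector of codons from one gene
# w:      named vector of relative adaptiveness values
cai_sketch <- function(codons, w, zero.threshold = 0.0001, zero.to = 0.01) {
  # Values of w considered as zero are forced to zero.to,
  # so that log(w) stays finite.
  w[w < zero.threshold] <- zero.to
  # Assumption of this sketch: non-synonymous and stop codons are
  # excluded by not being present in w.
  used <- codons[codons %in% names(w)]
  # Geometric mean computed through a natural log summation.
  exp(sum(log(w[used])) / length(used))
}

# Toy relative adaptiveness values (made up for this example).
w_toy <- c(gct = 0.25, gcc = 1, gca = 0.125, gcg = 0, aaa = 0.5, aag = 1)

# "atg" (single codon for Met) and "taa" (stop) are absent from w_toy,
# so they do not contribute.
gene <- c("gcc", "aag", "gct", "aaa", "atg", "taa")
cai_gene <- cai_sketch(gene, w_toy)
print(cai_gene)
```

Summing logarithms rather than multiplying the w values directly avoids numerical underflow for long genes, and the replacement of zero values keeps a single unused codon from driving the whole index to zero.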


Value

A single numerical value for the CAI.


Author(s)

J.R. Lobry


References

Sharp, P.M., Li, W.-H. (1987) The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Research, 15:1281-1295.

Bulmer, M. (1988) Are codon usage patterns in unicellular organisms determined by selection-mutation balance? Journal of Evolutionary Biology, 1:15-26.

Peden, J.F. (1999) Analysis of codon usage. PhD Thesis, University of Nottingham, UK.

The program codonW used here for comparison is available at under a GPL licence.


See Also

caitab for some w values from codonW. uco for codon usage tabulation.


Examples

  library(seqinr) # provides cai, read.fasta and the caitab dataset
# How to reproduce the results obtained with the C program codonW
# version 1.4.4 written by John Peden. We use here the "input.dat"
# test file from codonW (Saccharomyces cerevisiae).
  inputdatfile <- system.file("sequences/input.dat", package = "seqinr")
  input <- read.fasta(file = inputdatfile) # read the FASTA file
# Import results obtained with codonW
  scucofile <- system.file("sequences/scuco.txt", package = "seqinr")
  scuco.res <- read.table(scucofile, header = TRUE) # read codonW result file
# Use w for Saccharomyces cerevisiae
  data(caitab) # load the table of w values shipped with seqinr
  w <- caitab$sc
# Compute CAI and compare results:
  cai.res <- sapply(input, cai, w = w)
  plot(cai.res, scuco.res$CAI,
    main = "Comparison of seqinR and codonW results",
    xlab = "CAI from seqinR",
    ylab = "CAI from codonW",
    las = 1)

seqinr documentation built on April 6, 2023, 1:10 a.m.