grantham

The goal of {grantham} is to provide a minimal set of routines to calculate the Grantham distance [1].

The Grantham distance attempts to provide a proxy for the evolutionary distance between two amino acids based on three key side chain chemical properties: composition, polarity and molecular volume. In turn, evolutionary distance is used as a proxy for the impact of missense substitutions. The higher the distance, the more deleterious the substitution is expected to be.

Installation

Install {grantham} from CRAN:

install.packages("grantham")

You can install the development version of {grantham} like so:

# install.packages("remotes")
remotes::install_github("maialab/grantham")

Usage

Grantham distance between two amino acids:

library(grantham)

grantham_distance(x = 'Ser', y = 'Phe')

The function grantham_distance() is vectorised with amino acids being matched element-wise to form pairs for comparison:

grantham_distance(x = c('Ser', 'Arg'), y = c('Phe', 'Leu'))

The two vectors of amino acids must have compatible sizes in the sense of vctrs::vec_recycle() for element recycling to be possible: either the two vectors have the same length, or one of them has length one and is recycled to the length of the other.

# `'Ser'` is recycled to match the length of the second vector, i.e. 3.
grantham_distance(x = 'Ser', y = c('Phe', 'Leu', 'Arg'))
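The size-compatibility rule described above can be sketched in base R. This is only an illustration: recycle_pair() is a hypothetical helper, not part of {grantham} or {vctrs}, and it assumes the rule is exactly "same length, or one vector of length one". Note that this is stricter than base R's arithmetic recycling, which would repeat a length-2 vector against a length-3 vector with only a warning.

```r
# Hypothetical helper (not part of {grantham} or {vctrs}): recycles two
# vectors to a common length, allowing only equal lengths or a length of one.
recycle_pair <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  if (nx == ny) {
    return(list(x = x, y = y))
  }
  if (nx == 1L) {
    return(list(x = rep(x, ny), y = y))
  }
  if (ny == 1L) {
    return(list(x = x, y = rep(y, nx)))
  }
  stop("`x` and `y` must have the same length, or one must have length 1.")
}

# 'Ser' is repeated three times to pair with each element of the second vector.
recycled <- recycle_pair("Ser", c("Phe", "Leu", "Arg"))
recycled$x

# Lengths 2 and 3 are incompatible, so the helper signals an error.
result <- try(recycle_pair(c("Ser", "Arg"), c("Phe", "Leu", "Arg")), silent = TRUE)
inherits(result, "try-error")
```

With grantham_distance() itself, the equivalent failure would presumably surface as an error about incompatible sizes, though the exact message is not shown here.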

Use the function amino_acid_pairs() to generate all 20 x 20 amino acid pairs:

aa_pairs <- amino_acid_pairs()
aa_pairs

Now calculate the Grantham distance for every pair in aa_pairs:

grantham_distance(x = aa_pairs$x, y = aa_pairs$y)

Because distances are symmetric, and the distance between an amino acid and itself is trivially zero, you might want to exclude reversed pairs and self pairs:

# `keep_self = FALSE`: excludes pairs such as ("Ser", "Ser")
# `keep_reverses = FALSE`: excludes reversed pairs, e.g. ("Arg", "Ser") will be
# removed because ("Ser", "Arg") already exists.
aa_pairs <- amino_acid_pairs(keep_self = FALSE, keep_reverses = FALSE)

# These amino acid pairs are the 190 pairs shown in Table 2 of Grantham's
# original publication.
aa_pairs

# Grantham distance for the 190 unique amino acid pairs
grantham_distance(x = aa_pairs$x, y = aa_pairs$y)
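The pair counts mentioned above (400 in total, 190 unique) can be checked with base R alone. The sketch below does not call {grantham}; the vector aa and the intermediate objects are names introduced here for illustration, and the row order need not match what amino_acid_pairs() returns.

```r
# The 20 standard amino acids, as three-letter codes.
aa <- c("Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
        "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val")

# All ordered pairs, self pairs included: 20 * 20 = 400 rows.
all_pairs <- expand.grid(x = aa, y = aa, stringsAsFactors = FALSE)
nrow(all_pairs)

# Dropping the 20 self pairs leaves 380 ordered pairs.
no_self <- all_pairs[all_pairs$x != all_pairs$y, ]
nrow(no_self)

# Keeping a single orientation of each pair halves that: choose(20, 2) = 190.
unique_pairs <- t(combn(aa, 2))
nrow(unique_pairs)
```

The last count is the same 190 reported for keep_self = FALSE and keep_reverses = FALSE.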

The Grantham distance $d_{i,j}$ for two amino acids $i$ and $j$ is:

$$d_{i,j} = \rho (\alpha (c_i-c_j)^2+\beta (p_i-p_j)^2+ \gamma (v_i-v_j)^2)^{1/2}\ .$$

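The distance is based on three chemical properties of amino acid side chains: composition ($c$), polarity ($p$) and molecular volume ($v$). The constants $\alpha$, $\beta$ and $\gamma$ weight each squared difference, and $\rho$ is an overall scaling factor.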

We provide a data set with these properties:

amino_acids_properties

If you want to calculate the Grantham distance from these property values you may use the function grantham_equation().
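To make the formula concrete, here is a minimal stand-alone sketch of it in base R. It is not the package's grantham_equation(): grantham_sketch() is a hypothetical name, and its argument names are chosen here for illustration. The constants and the Ser and Phe property values are assumptions recalled from Grantham's 1974 paper; check them against amino_acids_properties before relying on them.

```r
# Hypothetical stand-alone helper, NOT the package's grantham_equation().
# The default constants are assumed from Grantham (1974).
grantham_sketch <- function(c_i, c_j, p_i, p_j, v_i, v_j,
                            alpha = 1.833, beta = 0.1018,
                            gamma = 0.000399, rho = 50.723) {
  # rho * sqrt(weighted sum of squared property differences)
  rho * sqrt(alpha * (c_i - c_j)^2 +
             beta  * (p_i - p_j)^2 +
             gamma * (v_i - v_j)^2)
}

# Assumed property values: composition (c), polarity (p), molecular volume (v).
ser <- c(c = 1.42, p = 9.2, v = 32)
phe <- c(c = 0.00, p = 5.2, v = 132)

d <- grantham_sketch(ser[["c"]], phe[["c"]],
                     ser[["p"]], phe[["p"]],
                     ser[["v"]], phe[["v"]])

# Rounded to the nearest integer, as in published Grantham tables.
round(d)  # 155 with the values assumed above
```

Swapping the two amino acids leaves the result unchanged, because every difference is squared, and comparing an amino acid with itself gives zero.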

Related software

Other sources we've found in the R ecosystem that also provide code for calculation of the Grantham distance:

Code of Conduct

Please note that the {grantham} package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

References

1. Grantham, R. Amino acid difference formula to help explain protein evolution. Science 185, 862–864 (1974). doi: 10.1126/science.185.4154.862.



maialab/grantham documentation built on Aug. 1, 2024, 2:32 a.m.