The goal of {grantham} is to provide a minimal set of routines to calculate the Grantham distance [1].
The Grantham distance attempts to provide a proxy for the evolutionary distance between two amino acids based on three key side chain chemical properties: composition, polarity and molecular volume. In turn, evolutionary distance is used as a proxy for the impact of missense substitutions. The higher the distance, the more deleterious the substitution is expected to be.
Install {grantham} from CRAN:
install.packages("grantham")
You can install the development version of {grantham} like so:
# install.packages("remotes")
remotes::install_github("maialab/grantham")
Grantham distance between two amino acids:
library(grantham)
grantham_distance(x = 'Ser', y = 'Phe')
The function grantham_distance() is vectorised, with amino acids matched element-wise to form pairs for comparison:
grantham_distance(x = c('Ser', 'Arg'), y = c('Phe', 'Leu'))
The two vectors of amino acids must have compatible sizes in the sense of vec_recycle() for element recycling to be possible, i.e., either the two vectors have the same length, or one of them is of length one, and it is recycled up to the length of the other.
# `'Ser'` is recycled to match the length of the second vector, i.e. 3.
grantham_distance(x = 'Ser', y = c('Phe', 'Leu', 'Arg'))
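The recycling rule described above can be sketched in base R. This is only an illustration: recycle_pair() is a hypothetical helper, not part of {grantham}, and it mimics the stated rule rather than reproducing the package's own implementation.

```r
# Hypothetical helper (not part of {grantham}): recycle two vectors to a
# common length, accepting only equal lengths or a vector of length one.
recycle_pair <- function(x, y) {
  n_x <- length(x)
  n_y <- length(y)
  if (n_x == n_y) {
    return(list(x = x, y = y))
  }
  if (n_x == 1L) {
    return(list(x = rep(x, n_y), y = y))
  }
  if (n_y == 1L) {
    return(list(x = x, y = rep(y, n_x)))
  }
  stop("`x` and `y` must have the same length, or one must be of length one.")
}

# 'Ser' is repeated three times to pair with each element of the second vector.
recycled <- recycle_pair("Ser", c("Phe", "Leu", "Arg"))
recycled$x
```

Under this rule, vectors of lengths 2 and 3 are incompatible, so the sketch stops with an error instead of recycling silently.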
Use the function amino_acid_pairs() to generate all 20 x 20 amino acid pairs:
aa_pairs <- amino_acid_pairs()
aa_pairs
Now calculate the Grantham distance for every pair in aa_pairs:
grantham_distance(x = aa_pairs$x, y = aa_pairs$y)
Because distances are symmetric, and the distance between an amino acid and itself is trivially zero, you might want to exclude reversed pairs and self-pairs:
# `keep_self = FALSE`: excludes pairs such as ("Ser", "Ser")
# `keep_reverses = FALSE`: excludes reversed pairs, e.g. ("Arg", "Ser") will be
# removed because ("Ser", "Arg") already exists.
aa_pairs <- amino_acid_pairs(keep_self = FALSE, keep_reverses = FALSE)
# These amino acid pairs are the 190 pairs shown in Table 2 of Grantham's
# original publication.
aa_pairs
# Grantham distance for the 190 unique amino acid pairs
grantham_distance(x = aa_pairs$x, y = aa_pairs$y)
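To see where the counts 400 and 190 come from, here is a minimal base R sketch that does not rely on {grantham}. The vector aa of three-letter codes is written out by hand for illustration, and expand.grid() and combn() stand in for whatever amino_acid_pairs() does internally.

```r
# The 20 standard amino acids, as three-letter codes (typed by hand here).
aa <- c("Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
        "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val")

# All ordered pairs, including self-pairs and reversed pairs: 20 x 20.
all_pairs <- expand.grid(x = aa, y = aa, stringsAsFactors = FALSE)

# Unordered pairs of distinct amino acids: choose(20, 2).
unique_pairs <- t(combn(aa, 2))

nrow(all_pairs)     # 400
nrow(unique_pairs)  # 190
```

Dropping the 20 self-pairs leaves 380 ordered pairs, and keeping one of each reversed pair halves that to 190.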
The Grantham distance $d_{i,j}$ for two amino acids $i$ and $j$ is:
$$d_{i,j} = \rho (\alpha (c_i-c_j)^2+\beta (p_i-p_j)^2+ \gamma (v_i-v_j)^2)^{1/2}\ .$$
The distance is based on three chemical properties of amino acid side chains:
- composition ($c$)
- polarity ($p$)
- molecular volume ($v$)
We provide a data set with these properties:
amino_acids_properties
If you want to calculate the Grantham distance from these property values, you may use the function grantham_equation().
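As a worked instance of the formula, the following base R sketch computes the Ser-Phe distance by hand. grantham_sketch() is a hypothetical function written for this illustration, not the package's grantham_equation(). The constants and the Ser and Phe property values are quoted from Grantham (1974) as assumptions; please check them against amino_acids_properties before relying on them.

```r
# Hypothetical re-implementation of the formula, for illustration only.
# Default constants are assumed to be those given by Grantham (1974).
grantham_sketch <- function(c_i, c_j, p_i, p_j, v_i, v_j,
                            alpha = 1.833, beta = 0.1018,
                            gamma = 0.000399, rho = 50.723) {
  rho * sqrt(alpha * (c_i - c_j)^2 +
             beta  * (p_i - p_j)^2 +
             gamma * (v_i - v_j)^2)
}

# Assumed property values: Ser (c = 1.42, p = 9.2, v = 32),
# Phe (c = 0, p = 5.2, v = 132).
d_ser_phe <- grantham_sketch(c_i = 1.42, c_j = 0,
                             p_i = 9.2,  p_j = 5.2,
                             v_i = 32,   v_j = 132)
round(d_ser_phe)  # 155
```

Because every difference is squared, swapping $i$ and $j$ gives the same value, which is why reversed pairs carry no extra information.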
Other sources we've found in the R ecosystem that also provide code for calculation of the Grantham distance:
- The function calculate_grantham(), see Fetch_Grantham.R.
- The {midasHLA} package includes the unexported function distGrantham() in utils.R.
- The {HLAdivR} package exports a data set with the Grantham distances in the format of a matrix, see data.R.
- {MSA2dist} by Kristian K. Ullrich provides the function aastring2dist().

Please note that the {grantham} package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
1. Grantham, R. Amino acid difference formula to help explain protein evolution. Science 185, 862--864 (1974). doi: 10.1126/science.185.4154.862.