R interfaces for protdist


Description

This function is an R interface for protdist in the PHYLIP package (Felsenstein 2013). protdist can be used to estimate the evolutionary distances between amino acid sequences under various models.

Usage

Rprotdist(X, path=NULL, ...)

Arguments

X

an object of class "proseq" containing aligned amino acid sequences.

path

path to the executable containing protdist. If path = NULL, R will search several commonly used directories for the correct executable file.

...

optional arguments to be passed to protdist. See details for more information.
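When protdist is installed somewhere R does not look by default, the location can be supplied through path. The following is a minimal sketch, assuming that path names the directory holding the executable and that protdist is on the system PATH. The helper find_protdist_dir is hypothetical and not part of Rphylip.

```r
# Hypothetical helper (not part of Rphylip): look for a protdist executable
# on the system PATH and return the directory that contains it, or NULL
# when nothing is found.
find_protdist_dir <- function(exe = "protdist") {
  hit <- unname(Sys.which(exe))
  if (nzchar(hit)) dirname(hit) else NULL
}

phylip_dir <- find_protdist_dir()

# With Rphylip and PHYLIP installed, the result could then be passed on as:
# D <- Rprotdist(X, path = phylip_dir)
```

A NULL result simply falls back to the default behaviour described for path = NULL.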

Details

Optional arguments include the following:

quiet

suppress some output to the R console (defaults to quiet = FALSE).

model

can be "JTT" (Jones et al. 1992), "PMB" (Veerassamy et al. 2003), "PAM" (Dayhoff & Eck 1968; Dayhoff et al. 1979; Kosiol & Goldman 2005), "Kimura" (a simple model based on Kimura 1980), "similarity", which gives the similarity between sequences, and "categories", which is due to Felsenstein.

gamma

alpha shape parameter of a gamma model of rate heterogeneity among sites (defaults to no gamma rate heterogeneity). Note that gamma rate heterogeneity does not apply to model = "Kimura" or model = "similarity".

kappa

transition:transversion ratio (defaults to kappa = 2.0).

genetic.code

type of genetic code to assume (options are "universal", the default, "mitochondrial", "vertebrate.mitochondrial", "fly.mitochondrial", and "yeast.mitochondrial").

categorization

categorization scheme for amino acids (options are "GHB", the George et al. 1988 classification, "Hall", a classification scheme provided by Ben Hall, and "chemical", a scheme based on Conn & Stumpf 1963).

ease

a numerical parameter that indicates how easily substitutions occur between amino acids of different categories, in which 0 is nearly impossible and 1 is no difficulty (defaults to ease = 0.457).

Note that kappa, bf, genetic.code, categorization, and ease apply only to model = "categories".

rates

vector of rates (defaults to single rate).

rate.categories

vector of rate categories corresponding to the order of rates.

weights

vector of weights of length equal to the number of columns in X (defaults to unweighted).

cleanup

remove PHYLIP input & output files after the analysis is completed (defaults to cleanup = TRUE).
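The restrictions noted above (gamma does not apply to "Kimura" or "similarity"; kappa, bf, genetic.code, categorization, and ease apply only to "categories") can be checked before calling the function. The following is a minimal sketch under those stated rules. The helper check_protdist_args is hypothetical and not part of Rphylip, and how Rprotdist itself reacts to inapplicable arguments is not specified here.

```r
# Hypothetical helper (not part of Rphylip): checks optional arguments
# against the rules stated in Details and returns them as a list.
check_protdist_args <- function(model = "JTT", ...) {
  args <- list(...)
  models <- c("JTT", "PMB", "PAM", "Kimura", "similarity", "categories")
  if (!model %in% models) {
    stop("unknown model: ", model)
  }
  # Gamma rate heterogeneity does not apply to these two models.
  if (!is.null(args[["gamma"]]) && model %in% c("Kimura", "similarity")) {
    stop("gamma cannot be combined with model = \"", model, "\"")
  }
  # These options apply only to model = "categories".
  cat_only <- c("kappa", "bf", "genetic.code", "categorization", "ease")
  misplaced <- intersect(names(args), cat_only)
  if (length(misplaced) > 0 && model != "categories") {
    stop("only valid with model = \"categories\": ",
         paste(misplaced, collapse = ", "))
  }
  # ease runs from 0 (nearly impossible) to 1 (no difficulty).
  ease <- args[["ease"]]
  if (!is.null(ease) && (ease < 0 || ease > 1)) {
    stop("ease must lie between 0 and 1")
  }
  c(list(model = model), args)
}

opts <- check_protdist_args(model = "categories", kappa = 2.0, ease = 0.457)

# With Rphylip and PHYLIP installed, the checked list could then be used as:
# D <- do.call(Rprotdist, c(list(X = chloroplast), opts))
```

Collecting the options in a list keeps the check separate from the call, so the same list can be reused across several runs.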

More information about the protdist program in PHYLIP can be found at http://evolution.genetics.washington.edu/phylip/doc/protdist.html.

Use of any function in this package requires that PHYLIP (Felsenstein 1989, 2013) first be installed. Instructions for installing PHYLIP can be found on the PHYLIP webpage: http://evolution.genetics.washington.edu/phylip.html.

Value

This function returns an object of class "dist".
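Because the result is a standard "dist" object, it can be handled with the usual base R tools. The sketch below uses a small hand-made matrix as a stand-in for real output, so it runs without PHYLIP; the taxon names and distance values are invented for illustration.

```r
# Toy distance matrix standing in for Rprotdist() output; the values and
# taxon names are made up for illustration.
taxa <- c("taxonA", "taxonB", "taxonC")
m <- matrix(c(0.00, 0.12, 0.30,
              0.12, 0.00, 0.25,
              0.30, 0.25, 0.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(taxa, taxa))
D <- as.dist(m)

class(D)                  # "dist"
labels(D)                 # taxon names carried by the object
full <- as.matrix(D)      # expand to a full symmetric matrix
full["taxonA", "taxonC"]  # look up a single pairwise distance

# Any function that accepts a "dist" object can consume it, for example
# average-linkage hierarchical clustering from the stats package.
hc <- hclust(D, method = "average")
```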

Author(s)

Liam J. Revell, Scott A. Chamberlain

Maintainer: Liam J. Revell <liam.revell@umb.edu>

References

Conn, E.E., Stumpf, P.K. (1963) Outlines of Biochemistry. John Wiley and Sons, New York.

Dayhoff, M.O., Eck, R.V. (1968) Atlas of Protein Sequence and Structure 1967-1968. National Biomedical Research Foundation, Silver Spring, Maryland.

Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. (1979) A model of evolutionary change in proteins. pp. 345-352 in Atlas of Protein Sequence and Structure, Volume 5, Supplement 3, 1978, ed. M.O. Dayhoff. National Biomedical Research Foundation, Silver Spring, Maryland.

Felsenstein, J. (1989) PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics, 5, 164-166.

Felsenstein, J. (2013) PHYLIP (Phylogeny Inference Package) version 3.695. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle.

George, D.G., Hunt, L.T., Barker, W.C. (1988) Current methods in sequence comparison and analysis. pp. 127-149 in Macromolecular Sequencing and Synthesis, ed. D.H. Schlesinger. Alan R. Liss, New York.

Jones, D.T., Taylor, W.R., Thornton, J.M. (1992) The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences (CABIOS), 8, 275-282.

Kimura, M. (1980) A simple model for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16, 111-120.

Kosiol, C., Goldman, N. (2005) Different versions of the Dayhoff rate matrix. Molecular Biology and Evolution, 22, 193-199.

Veerassamy, S., Smith, A., Tillier, E.R. (2003) A transition probability model for amino acid substitutions from blocks. Journal of Computational Biology, 10, 997-1010.

See Also

Rneighbor

Examples

## Not run: 
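# load the example amino acid alignment
data(chloroplast)
# estimate pairwise distances under the PAM model
D <- Rprotdist(chloroplast, model = "PAM")
# build a tree from the distance matrix
tree <- Rneighbor(D)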

## End(Not run)
