raxml: Maximum Likelihood Tree Estimation with RAxML

View source: R/raxml.R

raxmlR Documentation

Maximum Likelihood Tree Estimation with RAxML

Description

Provides an interface to the C program RAxML (see Reference section) for maximum likelihood estimation of tree topology and/or branch lengths, rapid and conventional non-parametric bootstrapping, mapping splits onto individual topologies, and a lot more. See the RAxML manual for details, especially if you are a new user of RAxML.

Usage

raxml(
  DNAbin,
  m = "GTRCAT",
  f,
  N,
  p,
  b,
  x,
  k,
  weights,
  partitions,
  outgroup,
  backbone = NULL,
  file = paste0("fromR_", Sys.Date()),
  exec,
  threads
)

Arguments

DNAbin

A matrix of DNA sequences of class DNAbin.

m

A vector of mode "character" defining a model of molecular evolution; currently only GTR model available.

f

A vector of mode "character" selecting an RAxML algorithm analogous to the -f flag (see Detail section and RAxML manual).

N

Either of mode "integer" or "character". Integers give the number of independent searches on different starting tree or replicates in bootstrapping. Alternatively, one of four bootstopping criteria can be chosen: "autoFC", "autoMR", "autoMRE", or "autoMRE_IGN".

p

Integer, setting a random seed for the parsimony starting trees.

b

Integer, setting a random seed for bootstrapping.

x

Integer, setting a random seed for rapid bootstrapping.

k

Logical, if TRUE, the branch lengths of bootstrapped trees are recorded.

weights

A vector of mode "numeric" giving integers to assign individual weights to each column of the alignment. (-a)

partitions

A data frame giving the partitions of the alignment.

outgroup

A vector of mode "character" containing the names of the outgroup taxa.

backbone

A phylo object representing a backbone tree.

file

A vector of mode "character" giving a name for the output files.

exec

A vector of mode "character" giving the path to the directory containing the RAxML executable. The default value will work on Mac OS X if the folder containing the executable is renamed to "RAxML-8.0.3".

threads

Integer, giving the number of parallel threads to use (PTHREADS only).

Details

There are some limitations of this wrapper compared to RAxML run directly from the command line.

  1. Only DNA is allowed as data type.

  2. Option f can only take a limited number of values (d, a).

RAxML needs the specification of random seeds for parsimony estimation of starting trees and for bootstrap resampling. The corresponding argument names in raxml are identical to the flags used by RAxML (-p, -b, and -x). If you choose not to give any values, raxml will generate a (different) value for each required random seed every time it is called. Be aware that set.seed will work only for p, but not for b or x.

Value

A list with a variable number of elements, depending on the analysis chosen:

"info" RAxML log file as character string
"bestTree" MLE of tree
"bipartitions" MLE of tree annotated with bootstrap proportions
"bootstrap" bootstrapped trees

Note

RAxML is a C program and the source code is not contained in this package. This means that in order to run this function you will need to install RAxML yourself. See https://cme.h-its.org/exelixis/web/software/raxml/ for the most recent documentation and source code of RAxML. Depending on where you chose to install RAxML, you need to adjust the exec argument.

raxml was last tested and running fine on Mac OS X with RAxML 8.0.29. Please be aware that calling third-party software from within R is a platform-specific process and I cannot guarantee that raxml will behave properly on any system.

References

(in chronological order)

Stamatakis, A., T. Ludwig and H. Meier. 2004. RAxML-III: A fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 1: 1–8.

Stamatakis, A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690.

Stamatakis, A., P. Hoover, and J. Rougemont. 2008. A rapid bootstrap algorithm for the RAxML web-servers. Syst. Biol. 75: 758–771.

Pattengale, N. D., M. Alipour, O. R. P. Bininda-Emonds, B. M. E. Moret, and A. Stamatakis. 2010. How many bootstrap replicates are necessary? Journal of Computational Biology 17: 337-354.

Stamatakis, A. 2014. RAxML Version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics Advance Access.

See Also

raxml.partitions to store partitioning information in a data frame suitable for input as partitions argument in raxml.

Examples

## bark beetle sequences
data(ips.cox1)
data(ips.16S)
data(ips.28S)

ips <- cbind(ips.cox1, ips.16S, ips.28S,
            fill.with.gaps = TRUE)

exec <- "/Applications/RAxML-code/standard-RAxML/raxmlHPC-PTHREADS-AVX"
w <- sample(1:5, ncol(ips.cox1), replace = TRUE)

## Not run: 

# Simple tree search with GTRCAT and GTRGAMMA
tr <- raxml(ips.cox1, f = "d", N = 2, p = 1234,
           exec = exec) # -1743.528461
tr <- raxml(ips.cox1, m = "GTRGAMMA", f = "d", N = 2, p = 1234,
           exec = exec)
   
# Applying weights to columns                   
tr <- raxml(ips.cox1, f = "d", N = 2, p = 1234,
           weights = w, exec = exec) # -1743.528461

# Rapid bootstrap
tr <- raxml(ips.cox1, m = "GTRGAMMA",
           f = "a", N = 10, p = 1234, x = 1234,
           exec = exec)

# Rapid bootstrap with automatic halt
tr <- raxml(ips.cox1, m = "GTRGAMMA",
           f = "a", N = "autoMRE", p = 1234, x = 1234,
           exec = exec)

## End(Not run)

ips documentation built on May 29, 2024, 4:39 a.m.