multi.ace: Ancestral states estimations with multiple trees

View source: R/multi.ace.R

multi.aceR Documentation

Ancestral states estimations with multiple trees


Fast ancestral states estimations run on multiple trees using the Mk model from castor::asr_mk_model.


  models = "ER",
  threshold = TRUE,
  verbose = FALSE,
  parallel = FALSE,
  estimation.details = NULL



A matrix or list with the characters for each taxa.


A phylo or mutiPhylo object (if the tree argument contains node labels, they will be used to name the output).


A vector of models to be passed to castor::asr_mk_model.


either logical for applying a relative threshold (TRUE - default) or no threshold (FALSE) or a numeric value of the threshold (e.g. 0.95). See details.


optional, a named vector of special tokens to be passed to grep (make sure to protect the character with "\\"). By default special.tokens <- c(missing = "\\?", inapplicable = "\\-", polymorphism = "\\&", uncertainty = "\\/"). Note that NA values are not compared and that the symbol "@" is reserved and cannot be used.


optional, a list of one or more functions for a special behaviour for special.tokens. See details.


optional, a vector of branch length modifiers (e.g. to convert time branch length in changes branch length) or a list of vectors (the same length as tree).


logical, whether to be verbose (TRUE) or not (FALSE - default).


logical, whether to use parallel algorithm (TRUE) or not (FALSE - default).


optional, see Value section below.


optional, a named list of options to be passed to function called by castor::asr_mk_model.


optional, whether to also return the details for each estimation as returned by castor::asr_mk_model. This argument can be left NULL (default) or be any combination of the elements returned by castor::asr_mk_model (e.g. c("loglikelihood", "transition_matrix")).


The models argument can be a single or a list of transition matrix, a single or a a vector of built-in model(s) (see below) or a list of both matrices and built-in models: The available built-in models in castor::asr_mk_model are:

  • "ER" for all equal rates

  • "SYM" for symmetric rates

  • "ARD" all rates are different

  • "SUEDE" equal stepwise transitions (e.g. for meristic/counting characters)

  • "SRD" different stepwise transitions

See directly castor::asr_mk_model for more models.

The threshold option allows to convert ancestral states likelihoods into discrete states. When threshold = FALSE, the ancestral state estimated is the one with the highest likelihood (or at random if likelihoods are equal). When threshold = TRUE, the ancestral state estimated are all the ones that are have a scaled likelihood greater than the maximum observed scaled likelihood minus the inverse number of possible states (i.e. select_state >= (max(likelihood) - 1/n_states)). This option makes the threshold selection depend on the number of states (i.e. if there are more possible states, a lower scaled likelihood for the best state is expected). Finally using a numerical value for the threshold option (e.g. threshold = 0.95) will simply select only the ancestral states estimates with a scaled likelihood equal or greater than the designated value. This option makes the threshold selection absolute. Regardless, if more than one value is select, the uncertainty token (special.tokens["uncertainty"]) will be used to separate the states. If no value is selected, the uncertainty token will be use between all observed characters (special.tokens["uncertainty"]).

special.behaviours allows to generate a special rule for the special.tokens. The functions should can take the arguments character, all_states with character being the character that contains the special token and all_states for the character (which is automatically detected by the function). By default, missing data returns and inapplicable returns all states, and polymorphisms and uncertainties return all present states.

  • missing = function(x,y) y

  • inapplicable = function(x,y) y

  • polymorphism = function(x,y) strsplit(x, split = "\\&")[[1]]

  • uncertainty = function(x,y) strsplit(x, split = "\\/")[[1]]

Functions in the list must be named following the special token of concern (e.g. missing), have only x, y as inputs and a single output a single value (that gets coerced to integer automatically). For example, the special behaviour for the special token "?" can be coded as: special.behaviours = list(missing = function(x, y) return(NA) to make ignore the character for taxa containing "?".

When using the parallel option (either through using parallel = TRUE by using the number of available cores minus on or manually setting the number of cores - e.g. parallel = 5), the castor::asr_mk_model function will use the designated number of cores (using the option Nthreads = <requested_number_of_cores>). Additionally, if the input tree is a "multiPhylo" object, the trees will be run in parallel for each number of cores, thus decreasing computation time accordingly (e.g. if 3 cores are requested and tree contains 12 "phylo" objects, 4 different "phylo" objects will be run in parallel on the 3 cores making the calculation around 3 times faster).


Returns a "matrix" or "list" of ancestral states. By default, the function returns the ancestral states in the same format as the input matrix. This can be changed using the option output = "matrix" or "list" to force the class of the output. To output the combined ancestral states and input, you can use "combined" (using the input format) or "combined.matrix" or "combined.list".


Thomas Guillerme

See Also

castor::asr_mk_model, char.diff


## A simple example:
## A random tree with 10 tips
tree <- rcoal(10)
## Setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)

## A random Mk matrix (10*50)
matrix_simple <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
                            invariant = FALSE)

## Run a basic ancestral states estimations
ancestral_states <- multi.ace(matrix_simple, tree)
ancestral_states[1:5, 1:5]

## A more complex example
## Create a multiple list of 5 trees
multiple_trees <- rmtree(5, 10)

## Modify the matrix to contain missing and special data
matrix_complex <- matrix_simple
matrix_complex[sample(1:length(matrix_complex), 50)] <- "-"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "0%2"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "?"

## Set a list of extra special tokens
my_spec_tokens <- c("weirdtoken" = "%")

## Set some special behaviours for the "weirdtoken" and for "-" and "?"
my_spec_behaviours <- list()
## Inapplicable tokens "-" are ignored
my_spec_behaviours$inapplicable <- function(x,y) return(NA)
## Missing tokens "?" are considered as all states
my_spec_behaviours$missing      <- function(x,y) return(y)
## Weird tokens are considered as state 0 and 3
my_spec_behaviours$weirdtoken   <- function(x,y) return(c(1,2))

## Create a random branch length modifier to apply to each tree
branch_lengths <- rnorm(18)^2

## Setting a list of model ("ER" for the 25 first characters and then "SYM")
my_models <- c(rep("ER", 25), rep("SYM", 25))

## Run the ancestral states on all the tree with multiple options
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
                              verbose = TRUE,
                              models = my_models,
                              threshold = 0.95,
                              special.tokens = my_spec_tokens,
                              special.behaviours = my_spec_behaviours,
                              brlen.multiplier = branch_lengths,
                              output = "combined.matrix")

## The results for the the two first characters for the first tree
ancestral_states[[1]][, 1:2]

## Not run: 
## The same example but running in parallel
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
                              verbose = TRUE,
                              models = my_models,
                              threshold = 0.95,
                              special.tokens = my_spec_tokens,
                              special.behaviours = my_spec_behaviours,
                              brlen.multiplier = branch_lengths,
                              output = "combined.matrix",
                              parallel = TRUE)

## End(Not run)

dispRity documentation built on May 29, 2024, 9:40 a.m.