fastSimmap: Fast implementation of stochastic mapping.
In ratematrix: Bayesian Estimation of the Evolutionary Rate Matrix

fastSimmap

R Documentation

Fast implementation of stochastic mapping.

Description

Simulates a stochastic character map conditioned on a Markov transition matrix 'Q' and a vector of root probabilities 'pi'.

Usage

fastSimmap(
  tree,
  x,
  Q,
  pi = "equal",
  mc.cores = 1,
  max_nshifts = 200,
  silence = FALSE
)

Arguments

`tree`	a phylogenetic tree of class 'phylo'.
`x`	a named vector with the states observed at the tips of the tree.
`Q`	a transition matrix for the Markov model. It must be provided; the user can estimate it from the observed data using any suitable method.
`pi`	Should be 'equal' or 'madfitz'. A numeric vector of prior probabilities for each state, with length equal to the number of states in the data, is also accepted. The order of the vector needs to be the same as the columns of Q.
`mc.cores`	same as in 'parallel::mclapply'. This is used to make multiple simulations (controlled with the argument 'nsim') by calling 'parallel::mclapply'.
`max_nshifts`	the maximum number of events (state changes) allocated for any given branch. See 'Details'.
`silence`	if TRUE, the function skips the data format checks and stops printing messages.

Details

This function is a simplification of Revell's 'phytools::make.simmap' function. Here the stochastic mapping is performed conditioned on a given Markov matrix and a vector of probabilities for the root node. This allows users to fit the Mk model using any preferred method and use this function to perform the stochastic mapping on the tree.
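As an illustration of the kind of matrix the function is conditioned on, the sketch below builds an equal-rates transition matrix in base R. The helper name 'make_equal_rates_Q', the state names, and the rate value are assumptions made for this example and are not part of the package; in practice 'Q' would usually come from a fitted Mk model.

```r
## Hypothetical helper (not part of ratematrix): build an equal-rates
## transition matrix for a given set of state names.
make_equal_rates_Q <- function(states, rate) {
  k <- length(states)
  ## Start with the same rate everywhere and label rows and columns.
  Q <- matrix(rate, nrow = k, ncol = k, dimnames = list(states, states))
  ## Each diagonal element is minus the sum of the off-diagonal rates
  ## in its row, so that every row sums to zero.
  diag(Q) <- -rate * (k - 1)
  Q
}

## Illustrative state names and rate.
Q <- make_equal_rates_Q(states = c("A", "B", "C"), rate = 0.0007)
```

Labelling both rows and columns with the state names makes it easier to check 'Q' against the states observed at the tips.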

The function returns a single stochastic map in the 'simmap' format. To get multiple simulations, call this function repeatedly, for example with 'lapply' (see 'Examples').

The prior probabilities at the root can be set to "equal" (i.e., all states have the same probability to be observed at the root) or to "madfitz" (i.e., state probabilities follow the likelihood of the Mk model).
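When a numeric vector is used instead, its order has to follow the columns of 'Q'. A minimal base R sketch of one way to guarantee that ordering is shown below; the state names and probabilities are illustrative assumptions, and the check that the values sum to one is a sanity check added here, not a documented requirement.

```r
## Placeholder transition matrix; the state names "A" and "B" are illustrative.
Q <- matrix(c(-0.0007, 0.0007, 0.0007, -0.0007), nrow = 2,
            dimnames = list(c("A", "B"), c("A", "B")))

## Root prior written by state name, in arbitrary order.
root_prior <- c(B = 0.8, A = 0.2)

## Reorder the prior so that it follows the columns of Q.
root_pi <- root_prior[colnames(Q)]

## Sanity checks: one value per state, and the values sum to one.
stopifnot(length(root_pi) == ncol(Q), abs(sum(root_pi) - 1) < 1e-8)

## root_pi could then be supplied as: fastSimmap(..., Q = Q, pi = root_pi)
```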

The argument 'max_nshifts' controls the size of the "memory buffer" that records the state changes in any given branch of the phylogeny. It DOES NOT influence the outcome of the stochastic character map simulations. Set this value to a high enough number (i.e., more changes than can possibly happen on any given branch). If the limit is reached, the function will print a message and return a value of 0.0 instead of the stochastic map. If that happens, simply increase 'max_nshifts' and run again. This is only a limitation of the algorithm used to speed up the simulation and DOES NOT affect the results in any way.
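The "increase and run again" advice can be automated. The sketch below is one possible approach, not part of the package: 'simmap_with_retry' is a hypothetical helper that doubles the buffer whenever the result is not of class 'simmap'. The doubling rule and the limit of five attempts are arbitrary choices.

```r
## Hypothetical helper (not part of ratematrix). 'sim_fun' is any function
## that takes 'max_nshifts' and returns either a 'simmap' object or 0.
simmap_with_retry <- function(sim_fun, max_nshifts = 200, max_tries = 5) {
  for (attempt in seq_len(max_tries)) {
    res <- sim_fun(max_nshifts = max_nshifts)
    ## A successful run returns an object of class 'simmap'.
    if (inherits(res, "simmap")) return(res)
    ## Buffer too small: double it and try again.
    max_nshifts <- max_nshifts * 2
  }
  stop("No stochastic map obtained after ", max_tries, " attempts.")
}

## Intended use (requires ratematrix and the objects from 'Examples'):
## map <- simmap_with_retry(function(max_nshifts)
##   fastSimmap(tree = phy, x = area, Q = Q, pi = "equal",
##              max_nshifts = max_nshifts))
```

Because 'max_nshifts' only sizes the buffer, retrying with a larger value does not change the distribution of the simulated maps.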

The reduced run time is achieved by using compiled code (C++) to perform the simulations. All calculations are the same as in Revell's original function.

If some of the states in the transition matrix "Q" are not present among the observed tips of the phylogeny, the function will return warning messages. The stochastic mapping will still work properly. However, please check that ALL states observed at the tips of the phylogeny are represented among the columns (and rows) of the transition matrix "Q".
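Both situations can be checked before running the simulation. The base R sketch below uses toy inputs; the taxon names, state names, and object names are illustrative assumptions.

```r
## Toy inputs; the taxon and state names are illustrative only.
x <- c(sp1 = "A", sp2 = "B", sp3 = "A", sp4 = "C")
Q <- matrix(0.0007, nrow = 2, ncol = 2,
            dimnames = list(c("A", "B"), c("A", "B")))
diag(Q) <- -0.0007

## States observed at the tips but absent from Q: these need to be fixed
## before the mapping is run.
missing_in_Q <- setdiff(unique(x), colnames(Q))

## States present in Q but never observed at the tips: per the text above,
## these only lead to warning messages.
unobserved <- setdiff(colnames(Q), unique(x))
```

With these toy inputs, state "C" is observed at a tip but has no column in 'Q', so 'missing_in_Q' is not empty and 'Q' would need to be extended.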

Note that if root probabilities are set by the user, the element "$logL" of the output list object will be the log-likelihood for the model computed with equal root probabilities for each state. This value is not used in any computation and only displayed as a reference. This issue will likely not be fixed.

Value

A stochastically mapped phylogeny of class 'simmap', or a value of 0 if 'max_nshifts' is reached. Please see 'Details'.

Author(s)

Daniel Caetano

Examples


## Load the package and the example data
library(ratematrix)
data(anoles)
area <- setNames(object = as.character(anoles$data$area), nm = rownames(anoles$data))
phy <- mergeSimmap(phy = anoles$phy[[1]], drop.regimes = TRUE)
## Define a transition matrix. This can be estimated using MLE or MCMC.
## Building one as an example.
Q <- matrix(0.0007, nrow = 2, ncol = 2)
diag(Q) <- diag(Q) * -1
colnames(Q) <- unique(area)
## Generate 10 stochastic mappings using lapply:
maps <- lapply(1:10, function(i) fastSimmap(tree = phy, x = area, pi = "equal", Q = Q))
## Now using a simple for loop.
maps <- vector(mode = "list", length = 10)
for( i in 1:10 ) maps[[i]] <- fastSimmap(tree = phy, x = area, pi = "equal", Q = Q)

ratematrix documentation built on June 3, 2022, 9:06 a.m.