sfreemap: Simulation free stochastic character mapping on a...

View source: R/sfreemap.R

sfreemapR Documentation

Simulation free stochastic character mapping on a phylogenetic tree

Description

This function performs an analitic stochastic character mapping on a phylogenetic tree (algorithym proposed by Minin and Suchard).

It can be called with a combination of parameters, much like any vectorized function in R. In other words, calling it with N trees (multiPhylo object) and a single rate matrix Q will return N mapped trees. Calling sfreemap with a single tree (phylo object) and M Q matrices will result in M mapped trees, replicas of the single tree with the algorithm applied to it using all Q matrices. Same logic applies to prior as well. It is important to note though that it you pass on N > 1 trees and M > 1 rate matrices (or priors), M and N should be equal.

Usage

sfreemap(tree, tip_states, Q=NULL, type="standard", model="SYM"
    , method="empirical", ...)

Arguments

tree

a phylogenetic tree as an object of class "phylo", or a list of trees as an object of class "multiPhylo".

tip_states

Two formats are accepted:

  • A named vector containing the states of the nodes at the tips as values, and the taxa labels as names;

  • A matrix with characters as columns, tip labels as rows and the state as values.

Q

The transition rate matrix. Can be given as a matrix with state names on dimensons or estimated by the program. Options for estimation depend on argumentos method, model and type.

type

The type of the tip_states being analysed. It can be "standard", usually used for morphological characters, or "dna", for nucleotideos. Default to "standard".

model

By choosing to estimate Q user can then select a model.

When type="standard" the available methods are:

  • "SYM" (default): symmetrical model, e.g, matric(c(0,1,2,1,0,3,2,3,0));

  • "ER": equal rates model, for example matrix(c(0,1,1,0));

  • "ARD": all rates different model, for example matrix(c(0,1,2,0));

When type="dna" the available methods are JC, F81, K80, HKY, TrNe, TrN, TPM1, K81, TPM1u, TPM2, TPM2u, TPM3, TPM3u, TIM1e, TIM1, TIM2e, TIM2, TIM3e, TIM3, TVMe, TVM, SYM and GTR.

method

The method argument is only used when type='standard' and the available options are:

  • empirical (default): first it fits a continuous-time reversible Markov model for the evolution of x and then simulates stochastic character histories using that model and the tip states on the tree. This is the same procedure that is described in Bollback (2006), except that simulation is performed using a fixed value of the transition matrix, Q, instead of by sampling Q from its posterior distribution (phytools).

  • "mcmc": samples n_simulation Q matrices from the posterior probability distribution of Q using MCMC, then performs stochastic maps conditioned on each sampled value of Q.

...

Optional parameters, listed below:

  • "prior": the prior distribution on the root node of the tree. Options are:

    • "equal" (default): root node is sampled from the conditional scaled likelihood distribution at the root;

    • "estimated": the stationary distribution is estimated by numerically solving pi*Q=0;

  • "tol" (default: 1e-8): the tolerance for zero elements in Q, elements less then tol will be set to tol;

  • "parallel" (default: TRUE): when tree is of type multiPhylo we can run sfreemap in parallel. The number of processes created will be the same as the cores available in your machine.

  • When Q="mcmc" some other parameters might be set:

    • "n_simulations" (default: 100): The number of Q matrices that will be generated;

    • "burn_in" (default: 1000): the burn in for the MCMC;

    • "sample_freq" (default: 100): number of generations for each sample taken;

Value

Returns a modified object of class "phylo", adding the following data:

mapped.edge

a matrix containing the expected value for dwelling times for each state along each edge of the tree;

mapped.edge.lmt

a matrix containing the expected number of labelled markov transitions for each state along each edge of the tree;

Q

the given or estimated value of Q;

logL

The likelihood of calculated for the given or sampled Q;

prior

The prios given or calculated for the root node;

Author(s)

Diego Pasqualin dpasqualin@inf.ufpr.br

References

Vladimir N Minin e Marc A Suchard. Fast, accurate and simulation-free stochastic mapping. Philosophical Transactions of the Royal Society B: Biological Sciences, 363 (1512): 3985-3995, 2008.


dpasqualin/sfreemapc documentation built on July 5, 2023, 10:52 a.m.