mcTreeLike: Calculate the likelihood for Markov chain models on trees. In Shicheng-Guo/lyne: Modeling the dynamics of epigenetic marks along a cell lineage trees.

Description

Likelihood for a given alignment and tree model (tree and transition probabilities between states for each edge). Sums over missing data (elimination algorithm). General trees, any node can be missing.

Usage

```
mcTreeLike(ali, edgeMat, transProb, eqFreq, takeLog=FALSE)
mcTreeLike.3states.inv(ali, mult=1, sumUp=TRUE, multMult=TRUE)
mcTreeLike.3states.GTR.bhom(pars, ali, mult=1, edgeMat, takeLog=FALSE, logPars=FALSE, sumUp=TRUE)
mcTreeLike.3states.GTR.bhom.gamma(pars, ali, mult=1, edgeMat, takeLog=FALSE, logPars=FALSE, sumUp=TRUE, details=FALSE)
mcTreeLike.3states.GTR.bhom.gamma.inv(pars, ali, mult=1, edgeMat, takeLog=FALSE, logPars=FALSE, sumUp=TRUE, details=FALSE)
optimize.mcTreeLike.3states.GTR.bhom(pars,ali,mult=1,edgeMat,takeLog=FALSE)
optimize.mcTreeLike.3states.GTR.bhom.gamma(pars,ali,mult=1,edgeMat,takeLog=FALSE)
optimize.mcTreeLike.3states.GTR.bhom.gamma.inv(pars,ali,mult=1,edgeMat,takeLog=FALSE)
```

Arguments

`pars` Vector: Parameters describing the substitution model on the tree (see Details).

`ali` Matrix: (number of tree nodes) x (number of observations). Contains integers, each representing a state in the Markov chain. Columns are observations, rows are tree nodes.

`mult` Vector: Multiplicity of alignment columns. Allows the use of sufficient statistics.

`edgeMat` Matrix: Integer edge matrix of the tree in bottom-up order. Two columns: the first holds the from-nodes (indexes), the second the to-nodes for each edge. The tree needs to be rooted. Nodes are integers that correspond to rows of `ali`.

`transProb` Array: Transition matrix for each edge. (number of states) x (number of states) x (number of edges in tree).

`eqFreq` Vector: The equilibrium frequencies at the root node of the tree.

`takeLog` Logical: Should the log be taken while calculating the likelihood?

`logPars` Logical: Are the parameters `pars` provided on `log` scale?

`sumUp` Logical: Should the likelihood be summed over alignment columns, or should the likelihood for each column be reported separately?

`details` Logical: For mixture models: Should the likelihood of each component be returned, or should the components be (weighted and) summed?

`multMult` Logical: If multiplicities are given (`mult != 1`) and `sumUp != TRUE`, should the log-likelihood for each observation be multiplied by its multiplicity?
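To make the argument shapes concrete, here is a minimal sketch that builds inputs for a three-node toy tree. It does not call the package. Encoding missing data as `NA` and treating the from-node as the parent are assumptions that the text above does not confirm.

```r
# Toy rooted tree: node 3 is the root, nodes 1 and 2 are its leaves.
# Rows of `ali` are nodes, columns are observations.
# Assumption: missing data are encoded as NA.
ali <- rbind(
  c(1L, 2L, 3L, 1L),   # node 1 (leaf)
  c(1L, 2L, 1L, NA),   # node 2 (leaf), last observation missing
  c(NA, NA, NA, NA)    # node 3 (root), never observed
)

# Edge matrix: column 1 = from-node, column 2 = to-node.
# Assumption: the from-node is the parent. Both edges end at the root,
# so any row order is already bottom-up here.
edgeMat <- rbind(c(3L, 1L), c(3L, 2L))

# One 3 x 3 transition matrix per edge; each row sums to 1.
P <- matrix(0.1, 3, 3)
diag(P) <- 0.8
transProb <- array(P, dim = c(3, 3, nrow(edgeMat)))

# Equilibrium frequencies at the root.
eqFreq <- c(0.5, 0.3, 0.2)

# With the package loaded, the call would then presumably be:
# lik <- mcTreeLike(ali, edgeMat, transProb, eqFreq, takeLog = FALSE)
```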

Details

All functions calculate the log-likelihood, given a model specification. `mcTreeLike` is the most generic and is used by all the other functions (except `mcTreeLike.3states.inv`) after converting parameter values into transition probabilities. The matrix `edgeMat` contains integers referencing rows in `ali` as the tree nodes. It needs to be provided in bottom-up order, such that its rows provide a traversal from the leaves to the root of the tree.
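The role of the bottom-up edge order can be illustrated with a small pruning (elimination) sketch. `pruneLik` is a hypothetical helper written for this illustration, not the package's implementation. It assumes missing data are `NA` and that the first column of `edgeMat` holds the parent of each edge.

```r
# Hypothetical helper: per-column likelihood by bottom-up elimination.
pruneLik <- function(ali, edgeMat, transProb, eqFreq) {
  nStates <- length(eqFreq)
  nNodes <- nrow(ali)
  # The root is the only node that never appears as a to-node.
  root <- setdiff(edgeMat[, 1], edgeMat[, 2])
  stopifnot(length(root) == 1)
  lik <- numeric(ncol(ali))
  for (j in seq_len(ncol(ali))) {
    # partial[s, v]: probability of the data at and below node v,
    # given that v is in state s. Missing nodes start as all ones.
    partial <- matrix(1, nStates, nNodes)
    for (v in seq_len(nNodes)) {
      if (!is.na(ali[v, j])) {
        partial[, v] <- 0
        partial[ali[v, j], v] <- 1
      }
    }
    # Bottom-up order guarantees that a child is complete before
    # its edge is folded into the parent.
    for (e in seq_len(nrow(edgeMat))) {
      parent <- edgeMat[e, 1]
      child <- edgeMat[e, 2]
      partial[, parent] <- partial[, parent] *
        drop(transProb[, , e] %*% partial[, child])
    }
    lik[j] <- sum(eqFreq * partial[, root])
  }
  lik
}

# Toy data: root (node 3) unobserved, one leaf missing in column 2.
ali <- rbind(c(1L, 1L), c(1L, NA), c(NA, NA))
edgeMat <- rbind(c(3L, 1L), c(3L, 2L))
P <- matrix(0.1, 3, 3)
diag(P) <- 0.8
transProb <- array(P, dim = c(3, 3, 2))
eqFreq <- c(0.5, 0.3, 0.2)
lik <- pruneLik(ali, edgeMat, transProb, eqFreq)
```

In this sketch an unobserved node contributes a vector of ones, so summing over its states needs no special handling.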

`mcTreeLike.3states.inv` calculates the likelihood assuming invariable states only. It uses the frequencies of completely observed (i.e., no missing data) invariable states in `ali` and does not need additional parameters. The likelihood of an observation with more than one state is zero. Observations with only one state and missing data are assigned the likelihood of the compatible invariant state.
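The rules for the invariant model can be sketched as follows. `invLik` is a hypothetical function for illustration only. It assumes `NA` marks missing data, ignores `mult`, and normalises the frequencies over the completely observed invariable columns. The text does not say how a column with no observed state is treated, so the sketch returns `NA` for it.

```r
# Hypothetical sketch of the invariant-only likelihood per column.
invLik <- function(ali, nStates = 3) {
  # Number of distinct observed states in each column.
  nDistinct <- apply(ali, 2, function(x) length(unique(x[!is.na(x)])))
  # First observed state in each column (NA if nothing is observed).
  firstState <- apply(ali, 2, function(x) x[!is.na(x)][1])
  complete <- colSums(is.na(ali)) == 0
  # Frequencies of completely observed invariable columns (assumed
  # to be normalised over those columns only).
  invCols <- complete & nDistinct == 1
  freq <- tabulate(firstState[invCols], nbins = nStates) / sum(invCols)
  lik <- rep(0, ncol(ali))          # more than one state: likelihood 0
  one <- nDistinct == 1
  lik[one] <- freq[firstState[one]] # single state, with or without gaps
  lik[nDistinct == 0] <- NA         # unspecified in the text
  lik
}

ali <- rbind(
  c(1L, 1L, 2L, 1L, 1L),
  c(1L, 1L, 2L, NA, 2L),
  c(1L, 1L, 2L, 1L, 1L)
)
lik <- invLik(ali)
```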

`mcTreeLike.3states.GTR.bhom` Branch-homogeneous model. The parameter vector is organized as follows:
`pars[1:3]`: Equilibrium frequencies
`pars[4:6]`: Rate parameters (1->2, 1->3, 2->3)
`pars[-(1:6)]`: Branch lengths of the tree
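How such a parameter vector might be turned into the `transProb` array can be sketched with a standard time-reversible (GTR) construction. `gtrTransProb` is a hypothetical helper. The package's actual scaling and parametrisation may differ, and the sketch assumes one branch length per edge and applies no rate normalisation.

```r
# Hypothetical helper: 3-state GTR transition matrices from `pars`.
gtrTransProb <- function(pars) {
  freq <- pars[1:3] / sum(pars[1:3])  # assumption: renormalise to sum 1
  rates <- pars[4:6]                  # exchangeabilities 1-2, 1-3, 2-3
  brLen <- pars[-(1:6)]               # assumed: one length per edge

  # Symmetric exchangeability matrix; upper.tri fills (1,2), (1,3), (2,3).
  R <- matrix(0, 3, 3)
  R[upper.tri(R)] <- rates
  R <- R + t(R)

  # Rate matrix: Q[i, j] = R[i, j] * freq[j], rows summing to zero.
  Q <- R %*% diag(freq)
  diag(Q) <- -rowSums(Q)

  # Reversibility makes diag(sqrt(freq)) Q diag(1 / sqrt(freq)) symmetric,
  # so the matrix exponential follows from a real eigendecomposition.
  sq <- sqrt(freq)
  es <- eigen(diag(sq) %*% Q %*% diag(1 / sq), symmetric = TRUE)
  left <- diag(1 / sq) %*% es$vectors
  right <- t(es$vectors) %*% diag(sq)

  transProb <- array(0, dim = c(3, 3, length(brLen)))
  for (e in seq_along(brLen)) {
    transProb[, , e] <- left %*% diag(exp(es$values * brLen[e])) %*% right
  }
  transProb
}

pars <- c(0.5, 0.3, 0.2,  1, 2, 0.5,  0.1, 0.3)
tp <- gtrTransProb(pars)
```

Each slice of the result has rows that sum to one and leaves the equilibrium frequencies unchanged, which is a quick sanity check for any parametrisation of this kind.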

The functions with the `optimize` prefix optimize the respective likelihood function, given data and initial parameters `pars`.
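The general pattern of maximising a likelihood from initial parameters can be shown on a deliberately small toy model with base R's `optim`. This is not the package's optimiser. The two-state symmetric model, the single edge, and the function name `negLogLik` are all assumptions made for the illustration. The branch length is optimised on log scale to keep it positive, and the column multiplicities act as sufficient statistics.

```r
# Toy model: one edge, two states, symmetric substitution with rate 1.
# Two distinct alignment columns: both ends equal, or both ends different.
mult <- c(same = 80, diff = 20)   # multiplicities of the two columns

# Negative log-likelihood as a function of the log branch length.
negLogLik <- function(logT) {
  t <- exp(logT)
  pDiff <- 0.5 - 0.5 * exp(-2 * t)   # probability of a state change
  pSame <- 1 - pDiff
  colLik <- 0.5 * c(pSame, pDiff)    # root frequency 0.5 times transition
  -sum(mult * log(colLik))
}

# One-dimensional search on log scale within finite bounds.
fit <- optim(par = log(0.1), fn = negLogLik, method = "Brent",
             lower = log(1e-6), upper = log(10))
tHat <- exp(fit$par)

# Closed-form estimate for this toy model, for comparison.
tExact <- -0.5 * log(1 - 2 * 20 / 100)
```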

Value

`mcTreeLike` returns a vector containing the likelihood for each column in the alignment matrix `ali`. If `takeLog == TRUE`, it is on log scale.

Shicheng-Guo/lyne documentation built on May 10, 2017, 1:36 p.m.