fitCorrSeq: Maximum likelihood estimate of evolutionary rates
In Caetanods/sequence-trait: Phylogenetic Comparative Methods to Study Sequence Traits and Correlated Evolution

View source: R/fitMLGammaSpanSpace.R

fitCorrSeq

R Documentation

Maximum likelihood estimate of evolutionary rates

Description

Function uses the discrete-gamma model of Yang (1994) to estimate variation in rates of evolution independent of sequence position and the auto-discrete-gamma model of Yang (1995) when rates are correlated among sites.

Usage

fitCorrSeq(data, phy, Q.model = "ER", rate.model = "gamma",
  root.type = "madfitz", add.invar = FALSE, poly.key = NULL,
  ncat = 4, init.M = FALSE, bounds = NULL, state.space = "max.div",
  gap.char = "-", opts = NULL, search.global = TRUE, init = NULL,
  verbose = TRUE, n.cores = 1)

Arguments

`data`	matrix with species names as rownames and sequence positions as the columns.
`phy`	ultrametric phylogeny in ape 'phylo' format.
`Q.model`	possible models are "ER" (single global rate) and "DEL" (a global rate for transitions between observed states and another rate for gains and loses of states).
`rate.model`	options are "correlated" , "gamma", and "single.rate". See Details below.
`root.type`	one of "madfitz", or "equal".
`add.invar`	TRUE or FALSE. Whether to compute transition rates for the invariant sites or not.
`poly.key`	a named list. Each element is a vector with the states represented by the polymorphism symbol. Names of the list need to match the polymorphism symbols in the data.
`ncat`	categories for the gamma function.
`init.M`	whether the initial state for the M matrix (for the correlated model) should have starting state equal to the gamma model (i.e., equal probabilities for all the transitions).
`bounds`	a numeric vector of length 2 with the lower and upper bonds for the rates.
`state.space`	can be "max.div", "site.div" or a numeric (length of 1 or equal to the number of columns in the data). The size of the state space is equal to the largest observed or equal to the per site state diversity plus the number.
`gap.char`	the character to be used as the gap in the data. Default is "-".
`opts`	the list of options for nloptr. If NULL it will use the default parameters of this function (not the same as the default for 'nloptr'). See more information in the help page for 'nloptr'.
`search.global`	whether to perform a global MLE search before the local MLE search. Default is TRUE.
`init`	set the initial parameters for the MLE search.
`verbose`	whether to print information to the screen.
`n.cores`	number of cores to perform the likelihood evaluation.

Details

This function implements the same model as the +G (plus Gamma) model commonly used in molecular evolution. Here we refer to this model as the "gamma" model. The main difference is that here the sequence is studied conditioned on a known phylogeny, where the molecular models are used to estimate the phylogenetic tree.

In addition, this function implements the not so common auto-discrete-gamma model (Yang 1995) that we refer as the "correlated" model. The function also implements the simplest case when all sequence positions share the same rate of evolution, we refer to this model as the "single.rate" model.

The "add.invar" parameter controls whether the model will include the estimate for invariant sites. Note that here we are not trying to estimate the branch lenghts of phylogenetic trees. For this reason, the transition rates estimated for the model represent only the sample and not the "overal molecular evolution rate of a lineage". Thus, it is not necessary to include the rate estimate for the invariant sequence positions. Of course, this is a prior decision and rely only on the objectives of the researcher. Changing this option changes the DATA for the analyses and not only the model. So conduct model choice between "add.invar == TRUE" and "add.invar == FALSE" models with care (if at all!).