dater: Estimate a time-scaled tree and fit a molecular clock


View source: R/treedater0.R

Description

Estimate a time-scaled tree and fit a molecular clock

Usage

dater(tre, sts, s = 1000, omega0 = NA, minblen = NA, maxit = 100,
  abstol = 1e-04, searchRoot = 5, quiet = TRUE,
  temporalConstraints = TRUE, clock = c("strict", "uncorrelated",
  "additive"), estimateSampleTimes = NULL,
  estimateSampleTimes_densities = list(), numStartConditions = 1,
  clsSolver = c("limSolve", "mgcv"), meanRateLimits = NULL, ncpu = 1,
  parallel_foreach = FALSE)

Arguments

tre

An ape::phylo which describes the phylogeny with branches in units of substitutions per site. This may be a rooted or unrooted tree. If unrooted, the root position will be estimated by checking multiple candidates chosen by root-to-tip regression. If the tree has multifurcations, these will be resolved and a binary tree will be returned.

sts

Vector of sample times for each tip in phylogenetic tree. Vector must be named with names corresponding to tre$tip.label.
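
As a minimal sketch of the expected shape, assuming hypothetical tip labels that end in a decimal sample date after an underscore (the labels and the name `tip_labels` are illustrative and not part of the package), a correctly named vector could be built like this:

```r
# Hypothetical tip labels of the form "<id>_<decimal year>";
# with a real tree these would come from tre$tip.label
tip_labels <- c("A_2001.50", "B_2003.25", "C_2005.00")

# Keep the text after the last underscore and convert it to a number
sample_times <- as.numeric(sub("^.*_", "", tip_labels))

# Name the vector by tip label so that names(sts) matches tre$tip.label
sts <- setNames(sample_times, tip_labels)
```

Building the names directly from the tip labels avoids mismatches between the vector and the tree.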

s

Sequence length (numeric). This should correspond to sequence length used in phylogenetic analysis and will not necessarily be the same as genome length.

omega0

Vector providing initial guess or guesses of the mean substitution rate (substitutions per site per unit time). If not provided, will guess using root to tip regression.
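
To illustrate the idea behind a root-to-tip regression guess, the following sketch regresses root-to-tip distance on sample time and reads the rate off the slope. The data are made up for illustration, and this is not necessarily how the package computes its own starting value.

```r
# Illustrative data: sample times and root-to-tip distances (substitutions per site).
# For a rooted ape::phylo, distances could instead be taken from
# ape::node.depth.edgelength(tre)[1:ape::Ntip(tre)]
sts <- c(t1 = 2000, t2 = 2002, t3 = 2004, t4 = 2006)
root_to_tip <- c(t1 = 0.010, t2 = 0.012, t3 = 0.014, t4 = 0.016)

# Regress distance on time; the slope is a crude estimate of the mean rate
fit <- lm(root_to_tip ~ sts)
omega0_guess <- unname(coef(fit)["sts"])
```

A value obtained this way could be passed as `omega0` when a specific starting rate is wanted.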

minblen

Minimum branch length in calendar time. By default, this will be the range of sample times (max - min) divided by sample size.
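
The documented default can be worked out by hand. This sketch assumes that "sample size" means the number of sampled tips and uses made-up sample times; it mirrors the rule as described here rather than the package's internal code.

```r
# Illustrative sample times for four tips
sts <- c(t1 = 2000, t2 = 2002.5, t3 = 2004, t4 = 2010)

# Range of sample times (max - min) divided by the number of samples
minblen_default <- diff(range(sts)) / length(sts)
```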

maxit

Maximum number of iterations

abstol

Difference in log likelihood between successive iterations for convergence.

searchRoot

Will search for the optimal root position using the top matches from root-to-tip regression. If searchRoot=x, dates will be estimated for x trees, and the estimate with the highest likelihood will be returned.

quiet

If TRUE, will suppress messages during execution

temporalConstraints

If TRUE, will enforce the condition that an ancestor node in the phylogeny occurs before all progeny. Equivalently, this will preclude negative branch lengths. Note that execution is faster if this option is FALSE.

clock

The choice of molecular clock model. Choices are 'uncorrelated', 'additive', or 'strict'.

estimateSampleTimes

If some sample times are not known with certainty, bounds can be provided with this option. This should take the form of a data frame with columns 'lower' and 'upper' providing the sample time bounds for each uncertain tip. Row names of the data frame should correspond to elements in tip.label of the input tree. Tips with sample time bounds in this data frame do not need to appear in the *sts* argument, however if they are included in *sts*, that value will be used as a starting condition for optimisation.
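
A minimal sketch of such a data frame follows. The tip names `tipA` and `tipB` and the bounds are hypothetical; with a real tree the row names must match entries of tre$tip.label.

```r
# Bounds for two tips whose sample times are uncertain (hypothetical names and values)
uncertain <- data.frame(
  lower = c(2001.0, 2003.0),
  upper = c(2002.0, 2003.5),
  row.names = c("tipA", "tipB")
)

# The data frame would then be passed to dater, for example:
# td <- dater(tre, sts, s = 1000, estimateSampleTimes = uncertain)
```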

estimateSampleTimes_densities

An optional named list of log densities which would be used as priors for unknown sample times. Names should correspond to elements in tip.label with uncertain sample times.
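
As a sketch only, and assuming that each list element is a function that takes a single numeric sample time and returns a log density (an assumption not spelled out in this page), such a list might look like the following. The tip names and distribution parameters are hypothetical.

```r
# Named list of log-density functions, one per uncertain tip (hypothetical names)
log_densities <- list(
  # Normal prior centred on mid-2001
  tipA = function(t) dnorm(t, mean = 2001.5, sd = 0.25, log = TRUE),
  # Uniform prior over the first half of 2003
  tipB = function(t) dunif(t, min = 2003.0, max = 2003.5, log = TRUE)
)

# Evaluate one of the priors at a candidate sample time
log_densities$tipB(2003.2)
```

Working on the log scale keeps the priors on the same scale as the log likelihood used for convergence.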

numStartConditions

Will attempt optimisation from more than one starting point if >0

clsSolver

Which package should be used for constrained least-squares? Options are "mgcv" or "limSolve"

meanRateLimits

Optional constraints for the mean substitution rate

ncpu

Number of threads for parallel computing

parallel_foreach

If TRUE, will use the "foreach" package instead of the "parallel" package. This may work better on some HPC systems.

Details

Estimates the calendar time of nodes in the given phylogenetic tree, whose branches are in units of substitutions per site. The calendar time of each sample and the length of the sequences used to estimate the tree must also be specified. If the tree is not rooted, this function will estimate the root position. For an introduction to all options and features, see the vignette on Influenza H3N2: vignette("h3n2")

Multiple molecular clock models are supported including a strict clock and two variations on relaxed clocks. The 'uncorrelated' relaxed clock is the Gamma-Poisson mixture presented by Volz and Frost (2017), while the 'additive' variance model was developed by Didelot & Volz (2019).

Value

A time-scaled tree and estimated molecular clock rate

References

Volz, E.M. and Frost, S.D.W. (2017) Scalable relaxed clock phylogenetic dating. Virus Evolution.

Didelot, X. and Volz, E.M. (2019) Additive uncorrelated relaxed clock models.

Author(s)

Erik M Volz <erik.volz@gmail.com>

See Also

ape::chronos, ape::estimate.mu

Examples

## simulate a random tree and sample times for demonstration
library(treedater)
# make a random tree:
tre <- ape::rtree(50)
# sample times based on distance from root to tip:
sts <- setNames(ape::node.depth.edgelength(tre)[1:ape::Ntip(tre)], tre$tip.label)
# modify edge lengths to represent evolutionary distance with rate 1e-3:
tre$edge.length <- tre$edge.length * 1e-3
# treedater:
td <- dater(tre, sts = sts, s = 1000, clock = 'strict', omega0 = .0015)

emvolz-phylodynamics/treedater-dev documentation built on Jan. 28, 2020, 6:05 p.m.