fit_punc_model: Detect punctuated evolution
In suryakevin/drugcandy: Detecting Punctuated Evolution in Dinosaurs and Viruses

fit_punc_model

R Documentation

Detect punctuated evolution

Description

This function fits the regression model(s) for detecting the punctuational effect at branching events.

Usage

fit_punc_model(data, vcv_parts, D, model = c("p", "pn", "pt", "ptn"))

Arguments

`data`	A data frame with path length in the 1st column, node count in the 2nd, and, optionally, sampling time in the 3rd
`vcv_parts`	A list returned by the `break_vcv` function
`D`	The normalizing matrix D returned by the `create_dmat` function
`model`	The model to fit: "p" (path ~ 1), "pn" (path ~ node), "pt" (path ~ time), or "ptn" (path ~ time + node)

Details

The goal is to estimate the expected difference in net evolution, morphological or molecular, between two taxa, one of which has undergone one additional branching event. A taxon's net divergence is proxied by its phylogenetic path length (root-to-tip distance), and the cumulative number of branching events along that path is its node count.
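As an illustration of these two quantities, the sketch below computes path length and node count for a hypothetical four-taxon tree stored as an edge matrix (the layout used by ape's "phylo" class). The tree, the branch lengths, and the helper `walk_to_root` are made up for this example, and the counting convention (internal nodes passed on the way to the root, not counting the root itself) is an assumption that may differ from the package's own.

```r
# Hypothetical tree (((A,B),C),D): tips are 1..4, the root is node 5,
# and the other internal nodes are 6 and 7. Each row is (parent, child).
edge <- rbind(c(5, 6), c(6, 7), c(7, 1), c(7, 2), c(6, 3), c(5, 4))
edge_length <- c(1.0, 0.5, 0.2, 0.3, 0.6, 1.5)
tip_labels <- c("A", "B", "C", "D")
root <- 5

# Walk from a tip back to the root, summing branch lengths and counting
# the internal nodes passed on the way (assumption: the root is not counted).
walk_to_root <- function(tip) {
  path <- 0
  node <- 0
  current <- tip
  while (current != root) {
    i <- which(edge[, 2] == current)  # the single edge leading to `current`
    path <- path + edge_length[i]
    current <- edge[i, 1]
    if (current != root) node <- node + 1
  }
  c(path = path, node = node)
}

# One row per taxon: path length in the 1st column, node count in the 2nd
toy_data <- as.data.frame(t(sapply(seq_along(tip_labels), walk_to_root)))
rownames(toy_data) <- tip_labels
print(toy_data)
```

The resulting data frame follows the column order that the `data` argument expects for the models without time.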

This expected difference should account for the strict clock, morphological or molecular. When all taxa in the tree are co-occurring (e.g., all present-day mammal species), the expected path length under the strict clock is the mean (path ~ 1). Note that this average should be phylogenetically normalized. This mean-only model is the simplest of the four.
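A minimal sketch of what a phylogenetically normalized mean can look like, assuming the normalization amounts to generalized least squares weighting by a phylogenetic variance-covariance matrix. The path lengths and the matrix `V` below are hypothetical, and this is not necessarily how the package computes the estimate internally.

```r
# Hypothetical path lengths for four co-occurring taxa
path <- c(A = 1.7, B = 1.8, C = 1.6, D = 1.5)

# Hypothetical phylogenetic variance-covariance matrix: each off-diagonal
# entry is the branch length shared by a pair of taxa.
V <- matrix(c(1.7, 1.5, 1.0, 0.0,
              1.5, 1.8, 1.0, 0.0,
              1.0, 1.0, 1.6, 0.0,
              0.0, 0.0, 0.0, 1.5),
            nrow = 4, byrow = TRUE,
            dimnames = list(names(path), names(path)))

ones <- rep(1, length(path))
V_inv <- solve(V)

# GLS estimate of the intercept in path ~ 1:
# (1' V^-1 1)^-1 1' V^-1 y
gls_mean <- drop((t(ones) %*% V_inv %*% path) / (t(ones) %*% V_inv %*% ones))

# Unweighted mean, for comparison
ordinary_mean <- mean(path)

print(c(gls = gls_mean, ordinary = ordinary_mean))
```

Closely related taxa (here A and B) share most of their history, so the weighted estimate avoids counting that shared history twice.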

Punctuated evolution means that more change accumulates during branching events. As a result, the evolutionary rate is not clock-like. So, we hypothesize that node count predicts deviation from the clock (path ~ node).

However, the strict clock has a different expectation when the taxa in the tree are not co-occurring. For example, taxa may be sampled serially, like SARS-CoV-2 genomes during the COVID-19 pandemic, or sporadically, like dinosaur fossils deposited in ancient sediments. In these cases, the strict clock expects the amount of divergence to scale proportionally with sampling time (path ~ time).

In such cases, the detection of punctuated evolution becomes the regression of path length on sampling time and node count (path ~ time + node).
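The regression of path length on sampling time and node count can be sketched with plain matrix algebra. Everything below is simulated and hypothetical: the covariance matrix `V` is an arbitrary positive-definite stand-in for a phylogenetic one, the true coefficients are invented, and the Cholesky-based normalization is one common choice that may not match the package's `D` matrix.

```r
set.seed(1)
n <- 30

# Hypothetical serially sampled taxa: sampling time and node count per tip
time <- sort(runif(n, 0, 10))
node <- rpois(n, lambda = 2 + time)

# Hypothetical stand-in for a phylogenetic variance-covariance matrix
# (any symmetric positive-definite matrix serves for the illustration)
A <- matrix(rnorm(n * n), n)
V <- crossprod(A) / n + diag(n)

# Simulate path lengths: clock rate 0.5 per unit time, a punctuational
# contribution of 0.1 per node, plus correlated noise
L <- t(chol(V))  # V = L %*% t(L)
path <- 0.2 + 0.5 * time + 0.1 * node + drop(L %*% rnorm(n, sd = 0.1))

# GLS fit of path ~ time + node: beta = (X' V^-1 X)^-1 X' V^-1 y
X <- cbind(Intercept = 1, time = time, node = node)
V_inv <- solve(V)
beta <- drop(solve(t(X) %*% V_inv %*% X, t(X) %*% V_inv %*% path))

# Normalized residuals, their sum of squares, and a variance estimate
resid_norm <- drop(solve(L, path - X %*% beta))
sse <- sum(resid_norm^2)
sigma2 <- sse / (n - ncol(X))

print(beta)
```

The coefficient on `node` is the quantity of interest: the expected extra path length per additional branching event after accounting for the clock-like effect of time.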

We can fit more complex models in which the degree of punctuational effect varies with time (e.g., path ~ time + node + time*node) or space (e.g., path ~ time + node * continent). However, `fit_punc_model` is limited to the four models listed above, so for such models it is best to write your own function or script.
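When writing such a custom model, it can help to check how R expands the formula. The short sketch below uses only base R's `terms()`; the variable names are placeholders taken from the formulas above and do not need to exist as data.

```r
# In R's formula syntax, `a * b` expands to a + b + a:b, so the first two
# formulas describe the same model.
f_long  <- path ~ time + node + time * node
f_short <- path ~ time * node
f_space <- path ~ time + node * continent

labels_long  <- attr(terms(f_long), "term.labels")
labels_short <- attr(terms(f_short), "term.labels")
labels_space <- attr(terms(f_space), "term.labels")

print(labels_long)
print(labels_short)
print(labels_space)
```

The interaction term (`time:node` or `node:continent`) is the one that lets the per-node effect change with time or differ between continents.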

Value

This function returns a list containing the fitted model (an object of class gls), estimated variance, phylogenetically-normalized residuals, and sum of squared errors (SSE).