knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
evoTS
The evoTS
package facilitates univariate and multivariate analyses of phenotypic change within lineages.
The evoTS
package extends the univariate modeling framework implemented in the paleoTS
package (Hunt 2006; 2008a; 2008b; Hunt et al. 2008; 2010; 2015) and has been developed to mirror the user experience from paleoTS
as much as possible. For example, all univariate models implemented in evoTS
are fitted to a paleoTS
object, i.e., the data format used in paleoTS
. The fit of all univariate models available in paleoTS
and evoTS
are directly comparable.
evoTS
contains a range of multivariate models, including different versions of multivariate unbiased random walks and Ornstein-Uhlenbeck processes. Together, these models allow the user to test various hypotheses of trait evolution, e.g., whether traits change in a correlated or uncorrelated manner, whether one trait/variable affects the optimum of a second trait (Granger causality), whether adaptation in different traits happen independently toward fixed optima etc.
evoTS
also contains functions for calculating the topology of the likelihood surfaces of fitted models, a useful feature to investigate the range of parameter values with approximately equal likelihood as the best parameter estimates.
paleoTS
(technical info)The models implemented in evoTS
build on the same assumptions as the models available in the paleoTS
package: all models assume the population (sample) means in the sequence of ancestor-descendants have a joint distribution that is multivariate normal with an expected mean vector and covariance matrix that are functions of the parameters of each model, the time intervals separating the populations (samples) in the sequence, and the sampling variances of the trait means. Given the assumption of multivariate normality, the expected distribution of sample means is given by their first, second, and mixed moments (covariance). As in the paleoTS
package, evoTS
use the built-in optimization routines in R for estimating maximum likelihood parameter estimates. The default hill-climbing optimization technique used in all univariate models (L-BFGS-B) is a quasi-Newton method that constrains the optimization of certain parameters (e.g., so that variance parameters cannot be smaller than 0). All multivariate models use the Nelder-Mead hill climbing search algorithm as default.
All models in evoTS
have been implemented using the joint parameterization routine from the paleoTS
package. The optimization is therefore fit using the actual sample values, with the autocorrelation among samples accounted for in the log-likelihood function.
In evoTS
, as in the paleoTS
package, the default setting when fitting models is that the sample variance is pooled across all samples/populations. Pooling sample variance makes sense when the sample size is low for some or many of the samples/populations in a time series. Pooling sample variance allows for estimating a more precise sample/population variance common for all samples/populations in stead of relying on each separate estimate of a sample/population variance. There may be circumstances when estimating a pooled sample/population variance is not necessary (e.g., if sample size is high for all samples/populations) or beneficial (the hypothesis being tested demands individually estimated sample variances). Whether sample variances are pooled or not is controlled by setting pool = TRUE
or FALSE
when fitting models in evoTS.
Relative model fit in evoTS
is evaluated based on the small sample-corrected version of the AICc (Akaike 1974; Burnham and Anderson 2002).
The evoTS
package is available on CRAN and GitHub.
## Installing from CRAN > install.packages("evoTS") > library(evoTS)
## Installing from GitHub > install.packages("devtools") > devtools::install_github("klvoje/evoTS") > library(evoTS)
evoTS
An object of class paleoTS
is the required input for most of the functions in evoTS
. To create a paleoTS
object, you need vectors of trait means, sample/population variances, sample sizes and ages of the samples/populations.
One easy way to create a paleoTS
object is to use the as.paleoTS
function from the paleoTS
package.
## Creating example data > trait_means<-rnorm(20) > trait_variance<-rep(0.5,20) > sample_size<-rep(30,20) > time_vector<-seq(0,19,1) # Create paleoTS object > indata.evoTS<-paleoTS::as.paleoTS(mm = trait_means, vv = trait_variance, nn = sample_size, tt = time_vector)
Another way to get data into evoTS
is to use the function read.paleoTS
from the paleoTS
package. This function imports data from a text file with four columns corresponding to sample sizes, trait means, sample/population variances, and ages of the samples/populations (in that order) and converts the input to a paleoTS
object.
See also the paleoTS
(vignette) for more info on how to import data and create a paleoTS
object.
evoTS
Two evolutionary sequences (time-series) of phenotypic change are included in the evoTS
package. The data are from the diatom lineage Stephanodiscus yellowstonensis and were originally published in Theriot et al. (2006). Each trait consists of 63 samples spanning almost 14 000 years of phenotypic change.
We will use these data to illustrate many of the functions implemented in evoTS.
We first investigate phenotypic change in the diameter of S. yellowstonensis. The diameter has been measured in micrometers, but we are interested in investigating proportional changes in the trait. We therefore first do an approximate log-transformation of the data. We then convert the time vector in the data set to unit length (i.e., the length in time from the oldest to youngest sample/population in the data set becomes 1). Such a linear transformation of the time vector does not change how the estimated parameters describe the evolutionary dynamics in the data, but ease parameter estimation and interpretation of certain model parameters. We also plot the data to have a look at the phenotypic changes.
## Doing an approximate log-transformation of the data > ln.diameter<-paleoTS::ln.paleoTS(diameter_S.yellowstonensis) ## Convert the time vector to unit length > ln.diameter$tt<-ln.diameter$tt/(max(ln.diameter$tt)) ## Plotting the data > plotevoTS(ln.diameter)
knitr::include_graphics("diameter.png")
evoTS
The evoTS
package contains a range of univariate models that expand and supplement the models available in paleoTS.
The paleoTS
package contains functions to fit biased (GRW) and unbiased random walks (URW), stasis (modeled as a white noise process, i.e., uncorrelated variation around a constant mean), strict stasis (no real evolutionary change) and an Ornstein-Uhlenbeck (OU) processes assuming a fixed optimum (see Hunt 2006 and Hunt et al. 2008 for info on these models). The paleoTS
package also contains models of a punctuated mode of evolution where punctuations (jumps in phenotype space) separate different parameterizations of the stasis model (Hunt 2008). A few other mode shift models (where the model of evolution shifts at some point during the evolutionary sequence) has also been implemented in paleoTS
(Hunt 2008; Hunt et al. 2015).
The following univariate models have been implemented in the evoTS
package:
A decelerated-evolution model (an unbiased random walk with an exponential decrease in the rate of change over time)
A accelerated-evolution model (an unbiased random walk with an exponential increase in the rate of change over time)
Ornstein-Uhlenbeck processes where the optimum changes according to an unbiased random walk.
The unbiased random walk model in the paleoTS
package model evolutionary changes as random draws from a normal distribution with a mean of zero (Hunt 2006). Each draw from the normal distribution represents a discrete evolutionary "step" and the variance of the normal distribution is called the step variance. The decelerated model of phyletic evolution is an unbiased random walk where the step variance is reduced exponentially through time (Voje 2020). This model is closely related to the early burst model developed for phylogenetic comparative data (e.g., Harmon et al. 2010, Cooper and Purvis 2010), but describes a reduced rate of evolution with time within a lineage and not within a clade. As for the random walk model (Hunt 2006), the expected evolutionary divergence between ancestor and descendant populations is always zero in the model of decelerated evolution. The expected trait mean and its variance and covariance are given by the following expressions:
$$ E[z_{i}] = z_{0} $$
$$ Var[z_{i}] = \sigma ^{2} {step.0}\left[ \frac{ e^{rt_i} - 1}r \right] +\epsilon {i} $$
$$ Cov[z_{i},z_{j}] = \sigma ^{2} {step.0}\left[ \frac{ e^{rt{min}} - 1}r \right] $$
where $z_{i}$ is the expected trait value for population i in the time series, $z_{i}$ is the ancestral trait mean, $\sigma ^{2} {step.0}$ is the step distribution, $r$ describes the exponential decay in the net rate of change through time and is constrained to be 0 or smaller, $t{i}$ is the time interval from the ancestral population mean (the start of the fossil sequence, which has a time of 0) to the ith population, and $t_{min}$ is the time interval from the ancestral population to the oldest of the two populations $z_{i}$ and $z_{j}$.
The decerelated model of evolution can be fitted to a time series using the opt.joint.decel
function.
> opt.joint.decel(ln.diameter) paleoTSfit object [n = 63 , K = 3 ] Model: Decel Method: Joint log-likelihood = 78.67217 AICc = -150.9376 Parameter estimates: anc vstep r 3.7195253 0.4308633 -1.3114667 Additional elements not printed: convergence logLFunction
The output returns the log-likelihood of the model parameters, the AICc score, the number (K) of estimated parameters, the length of the analysed time-series (n), the model name and the method used to parameterize the model (mMthod). anc is the estimated ancestral trait value, vstep is the initial value for the step distribution, and r describes the exponential decay in the vstep parameter through time.
The time it takes to half the net rate of evolution can be calculated based on the value of r using $-ln(2)/r$. The half-life parameter is interpreted based on the time-scale used when analyzing the data. Since time from start to end in our data has been scaled to unit length, the estimated half-life represent the percent of the total length of the time-series it takes for the rate of evolution to half. The half-life is (-ln(2)/-1.3114667 =) 0.53 in this example. The total length of the analyzed time-series is 13,728 years, which means it takes (13,728*0.53 =) 7,276 years for the net rate of evolution to be reduced by 50%.
What are the uncertainty of the estimated parameters in this model? Standard errors of the parameters are returned by setting hess = TRUE
when fitting a model. The standard errors are calculated based on the Hessian matrix, which is a square matrix of partial second order derivatives.
Another way to assess the uncertainty of the estimated parameters is to explore the likelihood-surface of the fitted model.
Investigating the likelihood surface can be helpful for several reasons.
Computing the likelihood surface is a great way to explore which parameter combinations that have an almost identical likelihood compared to the maximum likelihood values. Investigating the log-likelihood surface is also an approach to assess uncertainty in the estimated parameters. A large range of parameter values that have almost the same log-likelihood is an indication that we should be careful putting too much emphasis on only the maximum-likelihood (best) estimates of the parameters. The functions in evoTS
calculating log-likelihood surfaces report the upper and lower parameter estimates that are within two support units of the best estimate as a way to assess uncertainty in parameters (Edwards 1992). While standard errors computed from the Hessian matrix are always symmetric around the point estimate, the log-likelihood surface might not be (multivariate) normal. The reported upper and lower parameter estimates are therefore often not symmetrical around the maximum likelihood parameter estimates.
Estimating parameters in a model using maximum likelihood always run the risk of returning parameters from a local and not a global optimum in the likelihood landscape. Investigating the support surface for combinations of parameters is one way to explore the topology of the likelihood-surface.
Ridges in the log-likelihood surface can make it challenging to identify maximum likelihood estimates of the model parameters in certain cases. Flat ridges may for example cause identifiability issues and problems for the model to converge. Investigating the log-likelihood surface can therefore help diagnose challenges related to failures of models convergence.
The evoTS
package contain functions to create likelihood surfaces for univariate models in evoTS
and paleoTS
(e.g., loglik.surface.stasis
, loglik.surface.URW
, loglik.surface.GRW
, loglik.surface.OU
). These functions need a paleoTS
object and vectors containing candidate values for the parameters to be evaluated. Which candidate values to investigate is trial-and-error, but the maximum likelihood estimate of the parameter should always be in the interval.
For the decelerated model of evolution, the vectors given to the arguments vstep.vec
and r.vec
define the pairwise combinations of parameters for which the function will estimate the log-likelihood. The resolution of the input vectors therefore determines how accurate the visual representation of the support surface is, including the returned upper and lower estimates printed in the console. A higher resolution gives better precision, but demands more computation time. Note that the computed support surface is conditional on the best estimates of the other model parameters that are not part of the support surface (e.g., the estimated ancestral trait value is assumed to be 3.7195253 in the example below).
One way to define the candidate values is to use the seq
function.
> loglik.surface.decel(ln.diameter, vstep.vec = seq(0,1.3,0.01), r.vec = seq(-5,0,0.01)) lower upper vstep 0.18 1.26 r -3.23 -0.21
knitr::include_graphics("decel_logl_surface.png")
From the likelihood surface and from the printed confidence intervals, we see that r values between -3.23 and -0.21 are within 2 log-likelihood units from the best estimate for this parameter. This suggests we should be careful to exclude the possibility that the half-life of the decay in the rate of evolution is as much as 330% (45,312 years) or as low as 21% (2,946 years) of the investigated time-interval.
The accelerated evolution model is identical to the decelerated model except that the r parameter is constrained to be 0 or larger, which means the rate of evolution is accelerating with time.
The accelerated evolution model can be fitted using the opt.joint.accel
function.
> opt.joint.accel(ln.diameter) paleoTSfit object [n = 63 , K = 3 ] Model: Accel Method: Joint log-likelihood = 77.57017 AICc = -148.7336 Parameter estimates: anc vstep r 3.7104896 0.2387427 0.0000010 Additional elements not printed: convergence logLFunction
The accelerated evolution model has a lower (worse) log-likelihood and higher (worse) AICc score compared to the decelerated model of evolution.
A support surface can be produced using the loglik.surface.accel
function.
> loglik.surface.accel(ln.diameter, vstep = seq(0,5,0.01), r.vec = seq(0,1.5, 0.005)) lower upper vstep 0.090 5.00 r 0.035 1.35
knitr::include_graphics("accel_logl_surface.png")
The 3D plot can be rotated vertically and horizontally to get a better overview of the likelihood surface, which is why the observation angle is different for this 3D plot compared to the 3D plot for the decelerated model.
An Ornstein-Uhlenbeck model describes the evolution of a trait towards an optimum. The paleoTS
package includes an Ornstein-Uhlenbeck (OU) model of evolution with a single, fixed optimum (Hunt et al. 2008), portraying evolutionary adaptation of a trait towards a fixed peak on the adaptive landscape. However, peaks in the adaptive landscape might not be fixed and the evoTS
package contains functions to fit OU models where the optimum (peak) is constantly changing position according to an unbiased random walk. Such a model was proposed by Hansen et al. (2008) for analyses of phylogenetic comparative data. Adjusted to describe phenotypic evolution within a single lineage, the expected trait mean and its variance and covariance are given by the following expressions:
$$E[z_{i}] = e^{(-\alpha t_{i})}z_{0} + (1-e^{-\alpha t_{i}})\theta$$
$$Var[z_{i}] =\left[ \frac{ \sigma^{2}{step}+\sigma^{2}{\theta}}{2\alpha} \right] \left( 1-e^{(-2\alpha t_{i})}\right) + \sigma^{2}{\theta}t{i} \left[ 1-2(1-e^{-\alpha t_{i}}) /\alpha t_{i}\right] + \epsilon {i} $$
$$Cov[z{i},z_{j}] =\left[ \frac{ \sigma^{2}{step}+\sigma^{2}{\theta}}{2\alpha} \right] \left( 1-e^{(-2\alpha t_{a})}\right)e^{-\alpha t_{ij}} + \sigma^{2}{\theta}t{a} \left[ 1-\left(1+e^{-\alpha t_{ij}} \right) \left( 1-e^{-\alpha t_{a}} \right) / \alpha t_{i}\right]$$
where $z_{i}$ is the expected trait value for the ith sample, $z_{0}$ is the ancestral trait mean, $t_{i}$ is the time interval from the ancestral population mean (the start of the time-series, which has a time of 0) to the ith sample, $\theta$ is the optimum, $\alpha$ measures the rate of adaptation to the optimum, $\sigma^{2}{step}$ is the variance of the stochastic perturbations of z, and $\sigma^{2}{\theta}$ is the variance of the stochastic perturbations of the optimum, $t_{a}$ is the time interval from the ancestral population to the oldest of the two populations $z_{i}$ and $z_{j}$, and $t_{ij}$ is the time separating two samples $z_{i}$ and $z_{j}$. The estimation (sampling) error $\epsilon_{i}$ of the population means contribute to the expected variance between two population means.
The model can be fitted using the opt.joint.OUBM
function.
> opt.joint.OUBM(ln.diameter) paleoTSfit object [n = 63 , K = 4 ] Model: OU model with moving optimum (ancestral state at optimum) Method: Joint log-likelihood = 78.5667 AICc = -148.4437 Parameter estimates: anc/theta.0 vstep.trait alpha vstep.opt 3.71050957 0.25577371 4.45009756 0.00000001
The vstep.opt parameter describes the rate of change in the optimum. This is extremely small (virtually zero) in the example above, which means the optimum is essentially fixed. The alpha in the OU model represents the strength of the pull towards the optimum (Hansen 1997). A parameter that is easier to interpret compared to the alpha is the half-life, $ln(2)/ \alpha$, which is the time it takes for the trait to move half-way from the ancestral state to the optimum. The half life is therefore a quantification of the speed of adaptation towards the optimal state. As for the decelerated and accelerated models of evolution, the interpretation of the half life depends on the time-interval covered by the time-series. Since the time-interval of the time-series we analyze is scaled to unit length (i.e., the time from the start to the end of the time-series is 1), this means the half-life can be interpreted as the percent of the total length of the time-series. The half-life in our example is $ln(2)/ \alpha$ = 0.16. According to this point estimate, it takes the trait 16% of the total length of the time-series to evolve half-way towards the optimum, which is about (13,728 years * 0.16 =) 2197 years.
Note that the name of the first reported parameter is anc/theta.0. This parameter represents the ancestral trait value, but also the value of the "ancestral" optimum. The default option in the opt.joint.OUBM
function is to assume that the trait was perfectly adapted at the start of the time-series (the argument anc.opt = TRUE
), but this can be changed by setting anc.opt = FALSE
> opt.joint.OUBM(ln.diameter, opt.anc = FALSE) paleoTSfit object [n = 63 , K = 5 ] Model: OU model with moving optimum Method: Joint log-likelihood = 80.71298 AICc = -150.3733 Parameter estimates: anc vstep.trait theta.0 alpha vstep.opt 3.70316688 0.27295686 3.89044533 11.89309009 0.00000001
Setting opt.anc = FALSE
estimates a separate “ancestral” value for the optimum (theta.0). The rate of change in the optimum (vstep.opt
) is still negligible, which means this model is virtually identical to a model where the optimum is fixed. This can be shown by fitting an OU model where the optimum is fixed, which is the model included in the paleoTS
package.
> paleoTS::opt.joint.OU(ln.diameter) paleoTSfit object [n = 63 , K = 4 ] Model: OU Method: Joint Convergence: Successful log-likelihood = 80.71298 AICc = -152.7363 Parameter estimates: anc vstep theta alpha 3.7031741 0.2729583 3.8904507 11.8938651 Additional elements not printed: convergence logLFunction
The fixed optimum model gives the same log-likelihood value as the model where the optimum was allowed to change (but actually didn’t). The fixed optimum model has a better AICc score as this model contains one less parameter (the parameter describing the rate of change in the optimum).
It is good practice to repeat any numerical optimization procedure from different starting points. This is especially important when the model has several parameters, as parameter-rich models may contain more than one peak in the log-likelihood surface. The OUBM model is a type of model that may have several local peaks in the likelihood space.
The user can choose the number of iterations of the numerical optimization of the OUBM model using the argument iterations.
The function will return the parameter values from the run with the highest log-likelihood. The starting values in each iteration are drawn from a normal distribution with mean zero and a standard deviation set by the user (default is 1). The initial values for the vstep and alpha parameters are constrained to be equal or larger than 0.
Here, we run the opt.joint.OUBM
function (assuming the trait value is perfectly adapted at the start of the sequence) from 100 different starting points (i.e., 100 different initial parameter values):
> opt.joint.OUBM(ln.diameter, opt.anc = TRUE, iterations = 100) The optimization method is executed from multiple different starting points. Number of iterations: 100 paleoTSfit object [n = 63 , K = 4 ] Model: OU model with moving optimum (ancestral state at optimum) Method: Joint log-likelihood = 78.5667 AICc = -148.4437 Parameter estimates: anc/theta.0 vstep.trait alpha vstep.opt 3.71050956 0.25577772 4.45011497 0.00000001
From the output, we see that the likelihood score of the best model among the 100 model runs is identical to the score when we ran the model without any iterations. However, the maximum likelihood parameter estimates are slightly different (e.g., a difference in the sixth decimal for the vstep parameter), but not to an extent that changes our interpretation of the trait dynamics.
The evoTS
package contains functions to estimate likelihood surfaces for the different versions of the OU models (loglik.surface.OU
and loglik.surface.OUBM
). In these functions, the likelihood surface is not estimated as a function of the step variance and alpha parameter directly, but rather as a function of two related parameters that are easier to give a biological interpretation. The stationary variance, $vstep/(2 \alpha)$, represents the equilibrium variance of the OU process (Hansen et al.* 2008) and describes the variance expected in the trait after the trait has reached the optimum. The half-life, $log(2)/(\alpha)$, is the amount of time it takes for the trait to move half-way from the ancestral state to the optimum. The half-life is informative regarding the speed of adaptation toward the optimal state. To get an idea for which candidate values to investigate for the likelihood-surface, we first need to calculate the maximum likelihood values of the stationary variance and half-life parameters from the model output.
The OU model with a fixed optimum had the best relative model fit according to AICc among the three versions of the OU model we investigated. The maximum likelihood estimate of the half-life from this OU model is $log(2)/11.8941$ = 0.0583. The maximum likelihood estimate of the stationary variance is $0.2730/(2*11.8941)$ = 0.0115. But these are only point-estimates. We can explore the support interval around these point estimates of the half-life and the stationary variance using the loglik.surface.OU
function.
> loglik.surface.OU(ln.diameter, stat.var.vec=seq(0,0.1,0.001), h.vec=seq(0,0.4,0.001)) lower upper stationary variance 0.007 0.053 half-life 0.029 0.305
knitr::include_graphics("OU_logl_surface.png")
Half-life values up to 30% of the total length of the time-series are within two log-likelihood units from the best estimate. This indicates that substantially slower evolution than the point estimate of a 6% half-life cannot be ruled out as a possibility.
evoTS
and paleoTS
A quick way to evaluate the relative fit of all univariate models in the evoTS
and paleoTS
packages (excluding models with mode shifts) is to use the fit.all.univariate
function.
> fit.all.univariate(ln.diameter, pool = TRUE) Comparing 9 models [n = 63, method = Joint] logL K AICc dAICc Akaike.wt GRW 77.64073 3 -148.87469 3.861618 0.058 URW 77.57018 2 -150.94035 1.795953 0.162 Stasis 39.84019 2 -75.48039 77.255917 0.000 StrictStasis -707.46411 1 1416.99379 1569.730094 0.000 Decel 78.67217 3 -150.93756 1.798746 0.162 Accel 77.57017 3 -148.73357 4.002737 0.054 OU 80.71298 4 -152.73631 0.000000 0.397 OU model with moving optimum (ancestral state at optimum) 78.56670 4 -148.44374 4.292567 0.046 OU model with moving optimum 80.71298 5 -150.37333 2.362978 0.122
There is no a priori reason why a lineage should follow only one mode of evolution. The evoTS
package allows for investigating all pairwise model combinations of the models stasis, unbiased random walk (URW), trend (GRW) and an Ornstein-Uhlenbeck (OU) process with a fixed optimum using the function fit.mode.shift
. It is possible to either investigate specific shift points using the argument shift.point
or investigate all possible shift points, like below.
> fit.mode.shift(ln.diameter, model1 = "URW", model2 = "URW", minb = 10) Comparing 9 models [n = 63, method = Joint] logL K AICc dAICc Akaike.wt GRW 77.64073 3 -148.87469 3.861618 0.058 URW 77.57018 2 -150.94035 1.795953 0.162 Stasis 39.84019 2 -75.48039 77.255917 0.000 StrictStasis -707.46411 1 1416.99379 1569.730094 0.000 Decel 78.67217 3 -150.93756 1.798746 0.162 Accel 77.57017 3 -148.73357 4.002737 0.054 OU 80.71298 4 -152.73631 0.000000 0.397 OU model with moving optimum (ancestral state at optimum) 78.56670 4 -148.44374 4.292567 0.046 OU model with moving optimum 80.71298 5 -150.37333 2.362978 0.122 > fit.mode.shift(ln.diameter, model1 = "URW", model2 = "URW", minb = 10) [1] "Searching all possible shift points in the evolutionary sequence" Total # hypotheses: 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 paleoTSfit object [n = 63 , K = 4 ] Model: URW-URW Method: Joint log-likelihood = 79.27473 AICc = -149.8598 Parameter estimates: anc vstep vstep shift1 3.7304865 0.2432606 0.2494830 52.0000000 Log-likelihoods of all tested shift-points shift all.logl 11 70.86731 12 68.59747 13 62.65156 14 65.41455 15 53.23718 16 60.25449 17 56.19608 18 45.50594 19 42.55632 20 44.70092 21 46.05836 22 46.58797 23 48.14004 24 44.19647 25 67.63474 26 57.82010 27 64.69928 28 69.73926 29 58.45089 30 55.26295 31 53.40897 32 64.95625 33 66.97684 34 66.65108 35 72.98220 36 65.42928 37 71.12637 38 76.39216 * 39 74.96141 * 40 74.07654 41 74.06923 42 77.98640 * 43 75.27489 * 44 73.96364 45 78.18181 * 46 74.92223 * 47 75.10229 * 48 77.07127 * 49 77.28518 * 50 79.12256 * 51 75.99980 * 52 79.27473 ** 53 76.45725 * 54 70.37393 Shift occurs immediately AFTER listed sample number ** = maximum-likelihood shift point * = additional shift points in CI [ within 4.744 logL units; Chi-sq P = 0.95 , df = 4 ] Additional elements not printed: convergence logLFunction
The function fit.mode.shift
can also be used to fit all pairwise combinations of the four models by setting the fit.all
argument as TRUE.
If a shift point is not defined (using the shift.point
argument), all possible shift points are investigated for all models.
> fit.mode.shift(ln.diameter, fit.all = TRUE, minb = 10) [1] "Searching all possible shift points in the evolutionary sequence" 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 Comparing 16 models [n = 63, method = Joint] logL K AICc dAICc Akaike.wt Stasis-Stasis 53.24711 4 -97.80456 64.3140156 0.000 Stasis-URW 72.84306 4 -136.99646 25.1221145 0.000 Stasis-GRW 72.84414 5 -134.63564 27.4829349 0.000 Stasis-OU 74.97027 6 -136.44053 25.6780478 0.000 URW-URW 79.27473 4 -149.85981 12.2587671 0.001 URW-GRW 79.54430 5 -148.03597 14.0826065 0.000 URW-OU 85.34081 6 -157.18163 4.9369506 0.026 GRW-GRW 84.03615 6 -154.57229 7.5462836 0.007 GRW-OU 88.87002 7 -161.70368 0.4148986 0.254 OU-OU 90.26555 8 -161.86443 0.2541469 0.275 OU-GRW 84.14765 7 -152.25894 9.8596410 0.002 OU-URW 83.90670 6 -154.31340 7.8051743 0.006 OU-Stasis 87.80929 6 -162.11858 0.0000000 0.312 GRW-URW 83.79338 5 -156.53412 5.5844543 0.019 GRW-Stasis 86.49922 6 -159.49844 2.6201411 0.084 URW-Stasis 83.36672 5 -155.68080 6.4377793 0.012 [[1]] paleoTSfit object [n = 63 , K = 6 ] Model: OU-Stasis Method: Joint log-likelihood = 87.80929 AICc = -162.1186 Parameter estimates: anc vstep theta_OU alpha omega shift1 3.705695031 0.327387632 3.817864883 5.226471406 0.001870449 38.000000000 Additional elements not printed: convergence logLFunction GG
The function returns a list of the highest log-likelihood found for each investigated model. A detailed output from the model with the lowest AICc value among the 16 candidate models is also given. An OU-Stasis model with a shift point at sample (population) 38 has the best relative fit according to AICc. Note, however, that the model-combination Trend-OU (GRW-OU) has an almost equal AICc score relative to the best model. Also the combination of two OU models (each with their own fixed optimum) shows a good relative fit to the data.
It is possible to simulate data for all implemented models in evoTS
. Standard R-package documentation can be seen by entering ?sim.OUBM
and ?sim.accel.decel
Traits are rarely changing independently of each other due to shared genetics or development. Evolution is accordingly a multivariate phenomenon. The evoTS
package includes functions to fit the following multivariate trait models:
Multivariate unbiased random walks
Multivariate decelerated evolution
Multivariate accelerated evolution
Multivariate Ornstein-Uhlenbeck processes
Evolution as a multivariate unbiased random walk is modeled using an evolutionary rate matrix R (Felsenstein 1988; Revell and Harmon 2008). The diagonal elements in R represent the rate of evolution for the individual traits, while the off-diagonal elements represent the extent to which different traits co-evolve.
The multivariate variance-covariance matrix for the unbiased random walk model (V) is computed using the Kronecker product of the R matrix and a “distance matrix” C, describing the distance in time between the different samples/populations in the time-series.
$$V = \sum_{i=1}^{m} R \otimes C$$
Sampling error of the trait mean (calculated as the sample variance divided by the sample size) is – as for all models in evoTS
– added to the diagonal of the V matrix. To ensure symmetric positive definiteness of the V matrix during log-likelihood optimization, R is parameterized by its Cholesky decomposition as the cross-product of upper triangular matrices. In cases where different parts of the evolutionary time-series are described by $R_{m}$ matrices, V is computed as the sum of the different $R_{m}$ and $C_{m}$ matrices:
$$V = \sum_{i=1}^{m} R_{m} \otimes C_{m}$$
The current implementation of the multivariate unbiased random walk model allows testing six variants of the model. All variants of the model can be fitted using different specifications of R
and r
arguments in the fit.multivariate.URW
function.
There are two options for the structure of the R matrix. Setting R = "diagonal"
means only the diagonal elements of the R matrix will be estimated while off-diagonal elements are set to 0 (see panel a below). This parameterization of the R matrix means the changes in the traits are assumed to be uncorrelated. Setting R = "symmetric"
means all (both diagonal and off-diagonal) elements in the R matrix are estimated (panel b below). This parameterization estimates how changes in the traits are correlated.
knitr::include_graphics("R_matrices.png")
The argument r
in the fit.multivariate.URW
function defines whether the rate of change is assumed constant ("fixed"), asymptotically decreasing ("decel"), or asymptotically increasing ("accel") with time. Defining r
as "fixed" means a regular multivariate unbiased random walk is fitted to the data. The "decel" and "accel" options fit multivariate versions of the decelerated and accelerated versions of the unbiased random walk, respectively. These latter two models deviate from the multivariate unbiased random walk in that the distance matrix C is transformed by an exponential decay or acceleration parameter, r, that is jointly estimated during the maximum likelihood search for the R matrix. A common r parameter is assumed for all the traits.
To use the multivariate models, we first need multivariate data (i.e., at least two traits or variables).
The data set on phenotypic evolution in Stephanodiscus yellowstonensis (Theriot et al. 2006) contains data on the number of ribs in addition to the size of the diameter we have analyzed so far. Before combining the rib and diameter data into a multivariate data set, we first do an approximate log-transformation of the rib data and convert the time vector to unit length.
## Doing an approximate log-transformation of the data > ln.ribs<-paleoTS::ln.paleoTS(ribs_S.yellowstonensis) ## Convert the time vector to unit length > ln.ribs$tt<-ln.ribs$tt/(max(ln.ribs$tt))
We combine the two paleoTS
objects into a multivariate evoTS
object using the make.multivar.evoTS
function to make a multivariate data set ready to be analyzed in evoTS
.
> diam.ln_ribs.ln<-make.multivar.evoTS(ln.diameter, ln.ribs)
We can use the plotevoTS.multivariate
function to have a look at the combined data set.
> plotevoTS.multivariate(diam.ln_ribs.ln, y_min=3.4, y_max=4.8, x.label = "Relative time", pch=c(20,20))
knitr::include_graphics("diam_and_ribs.png")
Eye-balling the data seems to suggest that the traits change in a coordinated fashion.
We first fit a multivariate unbiased random walk model where the off-diagonal elements in the R matrix are zero, and the rate of evolution is assumes constant. This is equivalent as fitting two separate univariate unbiased random walks models to each of the two time-series.
> fit.multivariate.URW(diam.ln_ribs.ln, R = "diag", r = "fixed") [1] "Model converged successfully." $modelName [1] "Multivariate Random walk (R matrix with zero off-diagonal elements)" $logL [1] 136.1905 $AICc [1] -263.6914 $ancestral.values [1] 3.71049 4.00711 $SE.anc [1] NA $R [,1] [,2] [1,] 0.2387433 0.0000000 [2,] 0.0000000 0.4568377 $SE.R [1] NA $method [1] "Joint" $K [1] 4 $n [1] 63 $iter [1] NA attr(,"class") [1] "paleoTSfit"
The returned parameters include the ancestral trait values for the two traits and the evolutionary rate matrix R. The diagonal in the R matrix contains the step size (rate of evolution) parameters. The second trait (ribs) has about twice the rate of evolution as the first parameter (diameter). The off-diagonal elements are zero as this model is not estimating the covariance of the evolutionary changes in the two traits.
Next, we fit a model that allows the off-diagonal elements in the R matrix to be different from zero. We do this by setting R = symmetric
. We are still keeping the rate of change fixed through time.
> fit.multivariate.URW(diam.ln_ribs.ln, R = "symmetric", r = "fixed") $converge [1] "Model converged successfully" $modelName [1] "Multivariate Random walk (R matrix with non-zero off-diagonal elements)" $logL [1] 182.3777 $AICc [1] -353.7027 $ancestral.values [1] 3.717695 4.025925 $SE.anc [1] NA $R [,1] [,2] [1,] 0.2680092 0.3780642 [2,] 0.3780642 0.5524616 $SE.R [1] NA $method [1] "Joint" $K [1] 5 $n [1] 63 $iter [1] NA attr(,"class") [1] "paleoTSfit"
A multivariate random walk with correlated changes has a much better fit compared to the model assuming uncorrelated changes in the traits according to AICc. This indicates that the traits are not evolving independently of each other. The estimated R matrix indicates that the first trait has about half the rate of evolution as the second trait and that there is substantial covariance in the evolutionary changes of the two traits. How the two traits correlate in their changes can be computed by standardizing the covariance with the product of the standard deviations on the diagonal (this can also be done using the function cov2cor
in the stats
package).
> 0.3780642/(sqrt(0.2680092)*sqrt(0.5524616)) [1] 0.9825161 # Or alternatively: > model1<-fit.multivariate.URW(diam.ln_ribs.ln, R = "symmetric", r = "fixed") [1] "Model converged successfully." > stats::cov2cor(model1$R) [,1] [,2] [1,] 1.000000 0.982516 [2,] 0.982516 1.000000
A correlation of 0.98 basically means the two traits evolve as a single trait, since at least part of the deviation from a correlation of 1 is due to measurement error.
Note that the R matrix is not describing the underlying genetic or phenotypic (co)variances of the traits. The R matrix is therefore not the same as a P (or G) matrix in quantitative genetics. However, the R matrix is tightly connected to these matrices. For example, if the traits evolve only due to drift, the R matrix is expected to be proportional to the additive genetic variance–covariance matrix (G) (Lande 1979; Felsenstein 1988). Estimating R can therefore aid in evolutionary interpretations of the fossil record anchored in evolutionary quantitative genetics.
Standard errors of the elements in the R matrix can be approximated by the square root of the diagonal elements of the inverse of the negative of the Hessian matrix. These standard errors are automatically estimated and reported if the argument hess=TRUE
in defined.
Parameterizing multivariate models is demanding in terms of computational time (Felsenstein 1973; Hadfield and Nakagawa 2010; Freckleton 2012). Be aware that fitting the multivariate models to several traits with many samples (populations) will take much longer time than fitting univariate models to each trait separately. One way to somewhat follow the progress of the model fit is to set trace = TRUE
. This allows the user to follow the progress of the optimization routine to minimize the likelihood function.
The likelihood surface of multivariate models can contain several local peaks. It is therefore recommended to rerun the model from different starting points (i.e., different initial parameter values). The number of iterations can be defined by the iterations argument (e.g., iterations = 10
). Initial values for the search algorithm are drawn from a normal distribution with a default standard deviation of 1. The user can set this standard deviation by the iter.sd
argument. (e.g., iter.sd = 0.5
).
We can check if we find evidence for a change in the R matrix along the time-series. This involves estimating a separate R matrix for two non-overlapping parts of the time-series. This is done using the function fit.multivariate.URW.shift
. We can define the shift point using the argument shift.point
(e.g., shift.point = 20
) or we can investigate all possible shift points in the time series by not defining a shift point (the default option). The length of the smallest number of samples (populations) in each of the two segments is controlled by the minb
argument (the default is 10).
> fit.multivariate.URW.shift(diam.ln_ribs.ln, hess = TRUE) Total # hypotheses: 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 $converge [1] "Model converged successfully" $modelName [1] "Multivariate Random walk with two R matrices (with non-zero off-diagonal elements)" $logL [1] 190.7035 $AICc [1] -360.0108 $ancestral.values [1] 3.730242 4.061121 $SE.anc [1] 0.02942139 0.04554309 $R $R$R.1 [,1] [,2] [1,] 0.3712051 0.6509228 [2,] 0.6509228 1.1705596 $R$R.2 [,1] [,2] [1,] 0.2216526 0.2628519 [2,] 0.2628519 0.3164376 $SE.R $SE.R$SE.R.1 [,1] [,2] [1,] 0.01712463 0.02980279 [2,] 0.02980279 0.06689237 $SE.R$SE.R.2 [,1] [,2] [1,] 0.003596406 0.004424130 [2,] 0.004424130 0.006407946 $method [1] "Joint" $K [1] 9 $n [1] 63 $iter [1] NA $parameters shift1 18 $all.logl [1] 186.7757 185.1417 186.0625 189.5071 189.7518 187.6900 187.1944 190.7035 187.8904 187.7113 187.0241 186.5424 185.8391 185.3580 187.9486 185.8228 186.7893 186.8395 [19] 187.8938 186.4932 185.8709 186.7557 186.0485 184.9683 184.9777 184.3315 183.9433 183.9788 182.8478 182.2962 181.5945 181.9293 181.2754 180.8469 181.1119 180.5572 [37] 180.0302 179.8013 179.7917 180.0224 179.8181 179.8010 179.5718 179.7813 $GG [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [1,] 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [1,] 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 attr(,"class") [1] "paleoTSfit"
Two R matrices are returned (R.1 and R.2), along with their standard errors (since we used hess = TRUE
). The estimated shift point is 18 (shift1). The rate of evolution is much larger in the second trait compared to the first trait (the diagonal elements in R.1) while the difference in rate of evolution is much smaller after the shift point (the diagonal elements in R.2). The evolutionary correlation is 0.98 and 0.99 before and after the shift point. The number of parameters (K) for this model is 9 compared to 5 for the model where we estimated a single R matrix for the multivariate data set. The model with two R matrices has a lower (better) AICc score compared to the model with a single R matrix. This difference in AICc scores is likely to be a result of the differences in rates of evolution in R.1 and R.2 since the evolutionary correlation is estimated to be very similar in both R.1 and R.2.
We now check if the multivariate accelerated and decelerated models have a better fit than the multivariate random walk models.
> multi.accel<-fit.multivariate.URW(diam.ln_ribs.ln, R = "symmetric", r = "accel") > multi.decel<-fit.multivariate.URW(diam.ln_ribs.ln, R = "symmetric", r = "decel") > multi.accel$AICc;multi.decel$AICc [1] -351.2512 [1] -356.3287
The multivariate decelerated evolution model has a similar (but worse) fit to the data compared to the multivariate unbiased random walk model with a single R matrix, but is out-competed by the model estimating two R matrices.
Multivariate Ornstein-Uhlenbeck models allow for testing a range of different adaptive hypotheses of trait evolution, as detailed below. Adapted to describe evolution of traits within the same lineage, the multivariate Ornstein-Uhlenbeck process is described by the following differential equation (Bartoszek et al. 2012; Reitan et al. 2012; Clavel et al. 2015):
$$dY = A(\theta(t)-Y(t))dt + RdW(t)$$
where A is a square matrix that describes the rate of evolution toward the optimal trait values and with dimensions equal to the number of investigated traits (A is often called the pull matrix), $\theta$ is a vector containing the optimum for each trait, R is a square matrix describing the stochastic changes in the traits and with dimensions equal to the number of investigated traits (R is often called the drift matrix), and $W$ is the diffusion parameter. Under the assumption that the we only have one selective regime (optimum) per trait, the expected trait mean of the Ornstein-Uhlenbeck process is the weighted sum of the optimum and the root value (Hansen 1997):
$$E\left[Z_{i}\right]= e^{(-At_{i})}z_{0} + (1-e^{(-At_{i})})\theta$$
where $Z_{i}$ is a vector containing the expected trait values for sample i, $z_{0}$ is a vector containing the ancestral trait means, $\theta$ is a vector containing the optima, and $t_{i}$ is the time interval from the ancestral population mean (the start of the time-series, which has a time of 0) to the ith population mean.
The variance and covariance of sample/population means are given by the following expression:
$$Cov(z_{i}, z_{j})=\left[P\left( \left[ \frac{1}{\lambda_{k} + \lambda_{l}} \left( 1-e^{-({\lambda_{k} + \lambda_{l})}t_{a}}\right)\right]{1\le kl \le m} \odot P^{-1}\Sigma\Sigma^{T}(P^{-1})^{T}\right) P^{T}\right]e^{-A^{T}t{ij}}$$
where P is the orthogonal matrix of eigenvectors of A, $\Sigma\Sigma^{T}$ is the Cholesky decomposition of the R matrix, $\lambda_{i}$ is the ith eigenvalue of A, $t_{a}$ is the time interval from the ancestral population to the oldest of the two populations $z_{i}$ and $z_{j}$, and $t_{ij}$ is the time separating two samples $z_{i}$ and $z_{j}$, while $\odot$ represents the Hadamard (element-wise) product.
Different ways to parameterize the A matrix allows testing a range of evolutionary hypotheses using the multivariate OU process. While the R matrix in the multivariate version of the unbiased random walk (and in the multivariate OU model) either needs to be diagonal (only elements on the diagonal while off-diagonal elements are zero) or complete (in the sense that both the diagonal and off-diagonal elements are non-zero), this is not the case for the A matrix. Generally, we can divide up the hypotheses being tested using the multivariate OU into four different types:
Independent evolution (only diagonal elements in both the A and the R matrix, i.e., equivalent to fitting univariate models to each trait separately.)
Independent adaptation (only diagonal elements in the A matrix while the R matrix is completely parameterized. In such a model, the traits are adapting independently toward their optima, but the stochastic changes in the traits are correlated.)
At least one trait affects the optimum of the other trait. The diagonal and at least one of the off-diagonal elements in the A matrix is non-zero. It is the trait in the column of the non-zero off-diagonal element that affects the optimum of the trait in the row of the non-zero off-diagonal element. A negative off-diagonal element means the trait evolves toward the optimum determined by the other trait, while a positive off-diagonal element means the trait evolve away from the optimum determined by the other trait. The stochastic changes in the trait can be either correlated or non-correlated.
The same traits affect each others optimum (but to different degrees). The same off-diagonal elements on both sides of the diagonal are parameterized in the A matrix. The stochastic changes in the trait can be either correlated or non-correlated.
For a data set consisting of two traits, there are four possible parameterizations of the A matrix (see panels below). Independent adaptation of both traits (panel a), trait Y affecting the optimum of trait X (panel b), trait X affecting the optimum of trait Y, and the traits X and Y affecting each others optima (panel d). Each of these four ways of parameterizing the A matrix can be combined with an R matrix with elements only on the diagonal (the stochastic changes in the traits are uncorrelated) or a completely parameterized R matrix (the stochastic changes in the traits are correlated).
knitr::include_graphics("A_matrices.png")
An A matrix with off-diagonal elements investigates (Granger) causality between the two traits/variables (Granger 1969; Schweder 1970). Simply speaking, we have evidence for Granger causality if observations in one time series is useful for forecasting observations in one or several other time series, which would be the case if one trait affects the optimum that another trait tracks with a lag. Multivariate OU models therefore enable us to move beyond interpreting correlations among variables. While correlation is a measure of linear dependence between two random variables, the time dimension (i.e., the lag/evolutionary inertia in the tracking of the changing optimum) of the potential relationship between two variables is important in Granger causality. And while a correlation is symmetric (the correlation between X and Y is the same as the correlation between Y and X), this is not necessarily the case for Granger causality. X can (Granger) cause Y, without Y Granger-causing X.
It is also possible to implement a model where the last trait in the data set evolves as an unbiased random walk which affects the optima for all other traits in the data set. This model can be fitted by defining A.matrix ="OUBM"
, which sets the last diagonal element in the A matrix to zero. A zero diagonal element in the A matrix means there is no tendency for this trait to evolve towards an optimal value (which is the case for an unbiased random walk). Setting A.matrix ="OUBM"
will automatically define the R matrix to only have diagonal elements (i.e., this option for the A matrix will override how the R matrix is defined by the user). The reason for this is that the stochastic changes in the variable evolving as an unbiased random walk will affect the optimum of the other traits, which means the stochastic trait dynamics of the traits evolving according to an OU model should be independent of the changes in the optimum.
Multivariate Ornstein-Uhlenbeck models can be fitted using two different functions in evoTS
. The function fit.multivariate.OU
allows the user to use pre-defined arguments for the A and R matrices to parameterize these matrices. The A.matrix
can either be defined as "diag" (panel a above), "upper.tri" (panel b above), "lower.tri" (panel c above), and "full" (panel d above). Default is "diag". The R.matrix
can be defined as "diag" or "symmetric". The function fit.multivariate.OU
allows the user to test all (sensible) hypotheses for multivariate data that consists of two traits.
If the data consists of more than two traits, it is recommended to use the function fit.multivariate.OU.user.defined
. This function lets the user define which elements in the A and R matrices that will be parameterized. This function therefore allows full flexibility for the type of hypotheses that are being tested as the parameterization is not constrained to follow the pre-defined options for the A matrix available in the function fit.multivariate.OU
.
We will investigate whether the two traits from the S. yellowstonensis lineage show evidence of independent evolution (only diagonal elements in the A and R matrices) or whether only the adaptation part of the trait dynamics is independent in the two traits (diagonal A matrix and symmetric R matrix). We will test these hypotheses using the fit.multivariate.OU
function.
Note that the multivariate OU model demands much more computational time compared to univariate models and simple multivariate models (like the multivariate unbiased random walk). The computational time grows exponentially with the dimension of the variance–covariance matrix (e.g., Felsenstein 1973; Hadfield and Nakagawa 2010; Freckleton 2012). The time it takes to fit the multivariate OU models therefore both depends on the number of traits and the length of the time series. Setting trace = TRUE
allows the user to keep an eye on how the optimization proceeds.
> OUOU.model1<-fit.multivariate.OU(diam.ln_ribs.ln, A.matrix="diag", R.matrix="diag") > OUOU.model2<-fit.multivariate.OU(diam.ln_ribs.ln, A.matrix="diag", R.matrix="symmetric") > OUOU.model1$AICc;OUOU.model2$AICc [1] -267.6676 [1] -352.0637 > OUOU.model2 $converge [1] "Model converged successfully" $logL [1] 186.7299 $AICc [1] -352.0637 $ancestral.values [1] 3.718630 4.006437 $SE.anc [1] NA $optima [1] 3.876449 4.282549 $SE.optima [1] NA $A [,1] [,2] [1,] 12.70187 0.00000 [2,] 0.00000 14.60702 $SE.A [1] NA $half.life [1] 0.05457047 0.04745301 $R [,1] [,2] [1,] 0.02233866 0.07823089 [2,] 0.07823089 0.85571347 $SE.R [1] NA $method [1] "Joint" $K [1] 9 $n [1] 63 $iter [1] NA attr(,"class") [1] "paleoTSfit" >stats::cov2cor(OUOU.model2$R) [,1] [,2] [1,] 1.0000000 0.5576264 [2,] 0.5576264 1.0000000
The model with independent adaptation and correlated stochastic changes is much better than the independent evolution model. The half-life for the log diameter and log number of ribs are 6.1% and 4.9% of the length of the time-series, which translates into (13728 * 0.061 =) 837 and (13728 * 0.049 =) 673 years respectively. The correlation of the stochastic changes is substantial, but much reduced compared to the estimate of the correlation from the multivariate unbiased random walk. This is because a substantial part of the trait dynamics in a multivariate OU model is due to the deterministic approach of the traits toward the optima. The model has an almost identical relative fit compared to the multivariate unbiased random walk according to AICc, but is out-competed by the unbiased random walk with a mode shift.
We now test if a more complex parameterization of the A matrix gives us a better relative model fit according to AICc.
> OUOU.model3<-fit.multivariate.OU(diam.ln_ribs.ln, A.matrix="upper.tri", R.matrix="symmetric") > OUOU.model4<-fit.multivariate.OU(diam.ln_ribs.ln, A.matrix="full", R.matrix="symmetric") > OUOU.model5<-fit.multivariate.OU(diam.ln_ribs.ln, A.matrix="OUBM") > OUOU.model3$AICc;OUOU.model4$AICc;OUOU.model5$AICc [1] -350.0442 [1] -380.0126 [1] -305.4217
The best model is a model where each trait affects the optimum of the other trait.
However, before we trust this result, we should make sure to run the multivariate models from different initial starting values to increase our chances of finding a potentially higher peak on the log-likelihood surface. The number of iterations can be defined by the iterations
argument. The starting values are drawn from a normal distribution with a standard deviation of one. The user can define a larger or smaller standard deviation using the argument iter.sd
.
All models possible to investigate using the fit.multivariate.OU
function can also be investigated using the fit.multivariate.OU.user.defined
function as the latter lets the user define which elements in the A and R matrices that are parameterized. Which elements in A and R that should be parameterized or set to zero are given by 1 and 0 respectively. We can for example fit a model with a lower triangle A matrix and a symmetric R matrix. A lower triangle A matrix means the diameter is affecting the optimum of the number of ribs.
> A <- matrix(c(1,0,1,1), nrow=2, byrow=TRUE) > R <- matrix(c(1,1,1,1), nrow=2, byrow=TRUE) > OUOU.model6<-fit.multivariate.OU.user.defined(diam.ln_ribs.ln, A.user=A, R.user=R) > OUOU.model6$AICc [1] -350.146
The fit.multivariate.OU.user.defined
function is especially handy if we have a data set consisting of more than two traits. Let's for example assume that we have a multivariate data set consisting of two phenotypic traits and a climatic variable, which we refer to as variables M, N and O, respectively. Let's assume we have an hypothesis that the climatic variable (O) is changing as a random walk and is affecting the optimum of trait N. We also hypothesize that trait N affects the optimum of trait M, but that trait M is not affecting the optimum of trait N. We assume the stochastic changes in traits M, N are correlated while changes in O are uncorrelated with M and N. This hypothesis is parameterized by defining the A and R matrices the following way.
> A <- matrix(c(1,1,0,0,1,1,0,0,0), nrow=3, byrow=TRUE) > A [,1] [,2] [,3] [1,] 1 1 0 [2,] 0 1 1 [3,] 0 0 0 > R <- matrix(c(1,1,0,1,1,0,0,0,1), nrow=3, byrow=TRUE) > R [,1] [,2] [,3] [1,] 1 1 0 [2,] 1 1 0 [3,] 0 0 1
Note that it is possible to define nonsensical biological hypotheses given the full flexibility on how to parameterize A and R using the fit.multivariate.OU.user.defined
function. For example, an A matrix with only zeros on the diagonal and non-zero off-diagonal elements would parameterize a model where the variables changes according to a random walk (since the diagonal elements are zero), but where all variables are assumed to affect the optima for all variables in the data set (all off-diagonal elements are non-zero). This is a non-sense model as a variable changing according to a random walk has no tendency (or ability) to evolve towards an optimum.
All model fitting procedures to find maximum likelihood solutions need initial starting values for the model parameters. Most of the starting values when fitting multivariate OU models are based on maximum likelihood parameter estimates of the univariate versions of the OU model fitted to each trait separately. The starting values for the off-diagonal elements in the A and R matrices are set to 0 and 0.5, respectively. This way to set the initial parameter values seems to work fine for most data sets, but there is no guarantee that the provided initial parameters will always work as this depends on the nature of the data to be analyzed. If an error message is returned saying "function cannot be evaluated at initial parameters", it is recommended that the user sets the iteration option to 1 (iteration = 1
). The model fitting algorithm will then produce a set of (random) initial starting values and retry to start the optimization procedure 100,000 times.
Another option is to define which initial parameter values the optimization procedure should start from. These values can be provided by setting the user.init...
(e.g., user.init.off.diag.R = 1
) to something else than NULL (which is the default setting). Note that the length of the vector of user-specified initial values need to match the number of parameters to be estimated (e.g., user.init.off.diag.R = 1
for a multivariate data set consisting of two traits, user.init.off.diag.R = c(1,1,1)
if the data set contains three traits, etc.).
Akaike. H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19:716 - 723
Bartoszek, K., J. Pienaar, P. Mostad, S. Andersson, and T. F. Hansen. 2012 A phylogenetic comparative method for studying multivariate adaptation. Journal of Theoretical Biology 314:204–215.
Burnham, K. P. and D. R. Anderson. 2002. Model Selection and multimodel inference: a practical information-theoretic approach (2nd ed.)- Springer-Verlag.
Clavel, J., G. Escarguel, and G. Merceron. 2015. mvmorph: an r package for fitting multivariate evolutionary models to morphometric data. Methods in Ecology and Evolution 6:1311–1319.
Cooper, N., and A. Purvis, 2010. Body size evolution in mammals: complexity in tempo and mode. The American Naturalist 175:727–738.
Edwards, A. W. F. 1992. Likelihood. expanded edition Johns Hopkins University Press. Baltimore, MD.
Felsenstein, J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. The American Journal of Human Genetics 25:471–492.
Felsenstein, J. 1988. Phylogenies and quantitative characters. Annual Review of Ecology, Evolution, and Systematics 19:445–471.
Freckleton R. P. 2012. Fast likelihood calculations for comparative analyses. Methods in Ecology and Evolution 3:940-947.
Granger, C. W. J. 1969. Investigating causal relations by econometric models and cross‐spectral methods. Econometrica 37:424-438.
Hadfield, J. D., and S. Nakagawa. 2010. General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. Journal of Evolutionary Biology 23:494-508.
Hansen, T. F. 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution 51:1341–1351.
Hansen, T. F., J. Pienaar, and S. H. Orzack. 2008. A comparative method for studying adaptation to a randomly evolving environment. Evolution 62:1965–1977.
Harmon, L. J. et al. 2010. Early bursts of body size and shape evolution are rare in comparative data. Evolution 64:2385–2396.
Hunt, G. 2006. Fitting and comparing models of phyletic evolution: random walks and beyond. Paleobiology 32:578–601.
Hunt, G. 2008a. Gradual or pulsed evolution: when should punctuational explanations be preferred? Paleobiology 34:360–377.
Hunt, G. 2008b. Evolutionary patterns within fossil lineages: model-based assessment of modes, rates, punctuations and process. In R.K. Bambach and P.H. Kelley, eds. From Evolution to Geobiology: Research Questions Driving Paleontology at the Start of a New Century:578–601.
Hunt, G., M. Bell & M. Travis. 2008. Evolution towards a new adaptive optimum: phenotypic evolution in a fossil stickleback lineage. Evolution 62:700-710.
Hunt, G., S. Wicaksono, J. E. Brown, and G. K. Macleod. 2010. Climate-driven body size trends in the ostracod fauna of the deep Indian Ocean. Palaeontology 53:1255-1268.
Hunt, G., M. J. Hopkins, and S. Lidgard. 2015 Simple versus complex models of trait evolution and stasis as a response to environmental change. PNAS 112:4885–4890.
Lande, R. 1979. Quantitative genetic analysis of multivariate evolution, applied to brain: body size allometry. Evolution 33:402-416.
Reitan, T., T. Schweder, and J. Henderiks. 2012. Phenotypic evolution studied by layered stochastic differential equations. The Annals of Applied Statistics. 6:1531–1551.
Revell, L. J., and L. Harmon 2008. Testing quantitative genetic hypotheses about the evolutionary rate matrix for continuous characters. Evolutionary Ecology Research 10:311–331.
Schweder, T. 1970. Composable Markov processes. Journal of Applied Probability 7:400–410.
Theriot, E. C., S. C., Fritz, C. Whitlock, C. and D. J. Conley. 2006. Late Quaternary rapid morphological evolution of an endemic diatom in Yellowstone Lake, Wyoming. Paleobiology 32:38–54.
Voje, K. L. 2020. Testing eco‐evolutionary predictions using fossil data: Phyletic evolution following ecological opportunity. Evolution 74:188–200.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.