arbutus: Arbutus: Evaluate model adequacy for phylogenetic models of...

View source: R/wrapper-fxns.R

arbutusR Documentation

Arbutus: Evaluate model adequacy for phylogenetic models of continuous character evolution

Description

Arbutus was developed to evaluate the adequacy (or absolute goodness-of-fit) of phylogenetic models of continuous character evolution. The basic principle underlying the approach is that if the generating model is a Brownian motion process, the contrasts (sensu Felsenstein, 1985) will be independent and identically distributed (I.I.D.). We can evaluate this condition by calculating a set of test statistics the contrasts on our observed data, then simulating datasets under a Brownian motion process along the phylogeny and calculating the test statistics on the contrasts of each simulated data set. We can then compare our observed test statistics to the simulated distribution of test statistics.

The I.I.D. property of the contrasts will hold only for the case of Brownian motion. However, the approach can be extended to any arbitrarily complex model of continuous character evolution (as long as the model is based on a multivariate normal distribution) by first rescaling the phylogeny based on the fitted parameter estimates, creating what we refer to as a 'unit.tree'. Once the rescaling is done, the contrasts will be I.I.D. if the model that was fit is a adequate one.

The basic functionality of arbutus can be employed using the function arbutus. This function wraps several other functions which can be used separately:

  1. make_unit_tree: rescales phylogeny based on fitted parameter values

  2. calculate_pic_stat: calculates test statistics on observed data

  3. simulate_char_unit: simulates datasets under Brownian motion

  4. calculate_pic_stat: calculates test statistics on simulated data

  5. compare_pic_stat: compares observed to simulated statistics

Arbutus is designed to interact with a number of other packages for fitting trait models. The architecture is flexible such that additional packages can be incorporated relatively easily.

Use test statistics to assess model adequacy for phylogenetic models of continuous character evolution. This function is a simply a wrapper which does all the steps involved in evaluating model adequacy using the approach outlined in Pennell, FitzJohn, Cornwell and Harmon (in review).

Usage

arbutus(x, nsim = 1000, stats = NULL, ...)

Arguments

x

a fitted model object or a phylo/multiPhylo (see Details).

nsim

the number of datasets to simulate. This is passed as an argument to the function simulate_char_unit.

stats

a named list of test statistics to calculate on observed and simulated datasets. See calculate_pic_stat for details. If nothing is supplied, the function uses the default test statistics (see default_pic_stat

...

additional arguments to be passed to make_unit_tree

Details

arbutus was developed to evaluate the adequacy (or absolute goodness-of-fit) of phylogenetic models of continuous character evolution. The basic principle underlying the approach is that if the generating model is a Brownian motion process, the contrasts (sensu Felsenstein, 1985) will be independent and identically distributed (I.I.D.). We can evaluate this condition by calculating a set of test statistics on the contrasts of our observed data, then simulating datasets under a Brownian motion process along the phylogeny and calculating the test statistics on the contrasts of each simulated data set. We can then compare our observed test statistics to the simulated distribution of summary statistics.

The I.I.D. property of the contrasts will hold only for the case of Brownian motion. However, the approach can be extended to any arbitrarily complex model of continuous character evolution (as long as the model is based on a multivariate normal distribution) by first rescaling the phylogeny based on the fitted parameter estimates, creating what we refer to as a 'unit.tree'. Once the rescaling is done, the contrasts will be I.I.D. if the model that was fit is adequate.

P-values represent the two-tailed probability that the observed summary statistic (or distribution of summary statistics) came from the same distribution as the simulated summary statistics. Low p-values provide evidence that the model is inadequate.

The function arbutus wraps several other functions:

  1. make_unit_tree: rescales phylogeny based on fitted parameter values

  2. calculate_pic_stat: calculates test statistics on observed data

  3. simulate_char_unit: simulates datasets under Brownian motion with rate 1

  4. calculate_pic_stat: calculates test statistics on simulated data

  5. compare_pic_stat: compares observed to simulated test statistics

arbutus interacts with objects produced by fitting evolutionary models using a variety of packages. Currently supported objects are as follows:

  • a gfit object returned from fitting a model of continuous character evolution using fitContinuous in the geiger package.

  • a fit.mle object returned from fitting a model of continuous character evolution using find.mle in the diversitree package.

  • a mcmcsamples object returned from fitting a model of continuous character evolution using MCMC methods in the diversitree package. make_unit_tree will apply the same trait dataset to a set of unit trees based on sampled parameters. By default this will create a unit tree for every sample in the mcmc chain. To modify this, additional arguments can be use. burnin specifies how many samples to remove from the beginning of the chain. thin specifies the thinning interval (e.g. if thin=5, the function will create a unit tree from every fifth parameter set sampled. sample specifies how many samples to draw from the MCMC run.

  • a gls object returned from fitting a phylogenetic generalized least squares model of character correlation using gls in the nlme package.

  • a pgls object returned from fitting a phylogenetic generalized least squares model of character correlation using pgls in the caper package.

  • a phylolm object returned from fitting a phylogenetic generalized linear model of character correlation using phylolm in the phylolm package. As the phylogeny is not returned with the phylolm object, a phy argument must also be specified.

  • a phylo object. If a phylo object is supplied, the tree is assumed to have been rescaled previously. A second argument data must also be provided included the trait data as a named vector with names equal to the tip.labels of the phylogeny.

  • a multiPhylo object. If a multiPhylo object is supplied, the tree is assumed to have been rescaled previously. A second argument data must also be provided included the trait data as a named vector with names equal to the tip.labels of the phylogenies. Note that this function will append the same data set to every tree in the multiPhylo object.

  • a OUwie object returned from fitting a model of continuous character evolution using OUwie in the OUwie package.

References

Matthew W. Pennell, Richard G. FitzJohn, William K. Cornwell and Luke J. Harmon (in review.). Model adequacy and the macroevolution of angiosperm functional traits.

Joseph Felsenstein (1985). Phylogenies and the comparative method. The American Naturalist 125:1-15.

See Also

make_unit_tree, calculate_pic_stat, simulate_char_unit, compare_pic_stat

Examples

## finch data
data(finch)
phy <- finch$phy
data <- finch$data[,"wingL"]

## Not run: 
require(geiger)
## fit Brownian motion model
## using geiger's fitContinuous function
fit.bm <- fitContinuous(phy=phy, dat=data, model="BM",
                                 control=list(niter=10), ncores=1)

## check adequacy of BM model
## get p-values for default summary statistics
modelad.bm <- arbutus(fit.bm, nsim=10)
modelad.bm


## fit Ornstein-Uhlenbeck model
## again, using geiger's fitContinuous function
fit.ou <- suppressWarnings(fitContinuous(phy=phy, dat=data,
               model="OU", control=list(niter=10), ncores=1))

## check adequacy of OU model
modelad.ou <- arbutus(fit.ou, nsim=10)

require(diversitree)
## fit Brownian motion model using ML
## using diversitree's find.mle function

bmlik <- make.bm(phy, data)
fit.bm.dt <- find.mle(bmlik, 0.1)

## this creates a 'fit.mle' object which can be used
## in 'arbutus'
modelad.bm.dt <- arbutus(fit.bm.dt)

## fit Brownian motion model using MCMC
mcmc.bm.dt <- mcmc(bmlik, x.init=1, nsteps=1000, w=1)

## construct a unit tree object from mcmcsamples
## removing 100 samples as burnin
## and sampling 10 parameter sets
modelad.bm.dt.mcmc <- arbutus(mcmc.bm.dt,
                       burnin=100, samples=10)


require(nlme)
## Use pgls to look for a correlation between two traits

t1 <- data
t2 <- td$data[,"tarsusL"]
dd <- cbind.data.frame(t1, t2)

## fit gls model with corPagel correlation structure
fit.gls <- gls(t1~t2, data=dd, correlation=corPagel(phy=phy, value=1))

## this creates a 'gls' object which can be used
## in 'arbutus'
modelad.gls <- arbutus(fit.gls)


## End(Not run)


mwpennell/arbutus documentation built on Oct. 6, 2022, 10 a.m.