phylo.d: Calculates the phylogenetic D statistic
In caper: Comparative Analyses of Phylogenetics and Evolution in R


Calculates the D value, a measure of phylogenetic signal in a binary trait, and tests the estimated D value for significant departure from both random association and the clumping expected under a Brownian evolution threshold model.

phylo.d(data, phy, names.col, binvar, permut = 1000, rnd.bias=NULL)
## S3 method for class 'phylo.d'
print(x, ...)
## S3 method for class 'phylo.d'
summary(object, ...)
## S3 method for class 'phylo.d'
plot(x, bw=0.02, ...)

`data`	A 'comparative.data' or 'data.frame' object.
`phy`	An object of class 'phylo', required when data is not a 'comparative.data' object.
`names.col`	A name specifying the column in 'data' that matches rows to tips in 'phy', required when data is not a 'comparative.data' object.
`binvar`	The name of the variable in `data` holding the binary variable of interest.
`permut`	Number of permutations to be used in the randomisation test.
`rnd.bias`	An optional name of a variable in `data` holding probability weights to bias the generation of the random distribution. See 'Details'.
`x`	An object of class 'phylo.d'.
`object`	An object of class 'phylo.d'.
`bw`	The bandwidth to be used for the density plots.
`...`	Further arguments to print and summary methods.

The sum of changes in estimated nodal values of a binary trait along edges in a phylogeny (D) provides a measure of the phylogenetic signal in that trait (Fritz and Purvis, 2010). If a trait is highly conserved, with only a basal division between two clades each expressing a single trait value, then the only changes occur along the two daughter edges of the root. This gives a summed value of 1: the two differences of 0.5 between the root nodal value of 0.5 and the nodal values of the ancestors of the 1 and 0 clades. In contrast, if the trait is labile, more differences will be observed and the sum will be higher.
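The arithmetic in this paragraph can be illustrated with a minimal base R sketch on a hypothetical balanced four-tip tree. It assumes that each nodal value is the plain mean of its two daughters, which ignores branch lengths; `sum_sister_diffs` is an illustrative helper and not part of caper.

```r
# Hypothetical helper: sum of sister-clade differences on the balanced
# four-tip tree ((t1,t2),(t3,t4)).
# Assumption: a nodal value is the unweighted mean of its two daughters.
sum_sister_diffs <- function(tips) {
  stopifnot(length(tips) == 4, all(tips %in% c(0, 1)))
  left  <- mean(tips[1:2])   # nodal value for the (t1,t2) clade
  right <- mean(tips[3:4])   # nodal value for the (t3,t4) clade
  abs(tips[1] - tips[2]) + abs(tips[3] - tips[4]) + abs(left - right)
}

conserved <- sum_sister_diffs(c(1, 1, 0, 0))  # single basal split
labile    <- sum_sister_diffs(c(1, 0, 1, 0))  # states alternate among tips
```

Under these assumptions the conserved arrangement sums to 1 and the alternating arrangement sums to 2, matching the contrast drawn in the text.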

This function calculates the observed D for a binary trait on a tree and compares it to the values of D obtained from an equal number of simulations (set by permut) under each of two models:

Phylogenetic randomness: Trait values are randomly shuffled relative to the tips of the phylogeny and D is calculated.
Brownian threshold model: A continuous trait is evolved along the phylogeny under a Brownian process and then converted to a binary trait using a threshold that reproduces the relative prevalence of the observed trait.
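The two null models might be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the four-tip tree, its covariance matrix `V` (unit branch lengths) and the observed vector `obs` are all hypothetical.

```r
set.seed(42)  # for reproducibility of this sketch only

# Hypothetical observed binary trait for tips t1..t4 of ((t1,t2),(t3,t4)).
obs <- c(1, 0, 0, 0)
n_ones <- sum(obs)

# Model 1, phylogenetic randomness: shuffle the states across the tips.
random_trait <- sample(obs)

# Model 2, Brownian threshold.
# Assumed tip covariance for unit branch lengths: each entry is the path
# length shared by two tips, measured from the root.
V <- matrix(c(2, 1, 0, 0,
              1, 2, 0, 0,
              0, 0, 2, 1,
              0, 0, 1, 2), nrow = 4, byrow = TRUE)

# Draw one multivariate normal sample with covariance V.
continuous <- as.vector(t(chol(V)) %*% rnorm(4))

# Threshold so that the n_ones largest values become state 1, which
# reproduces the observed prevalence of the trait.
brownian_trait <- as.integer(rank(-continuous, ties.method = "first") <= n_ones)
```

Both simulated traits contain the same number of 1s as `obs`, so any difference in D reflects only how the states are arranged on the tree.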

The value of D depends on phylogeny size - more sister clades yield higher sums - and so the means of the two sets of simulated data are used as calibrations to scale both observed and simulated values of D to set points of 0 (as phylogenetically conserved as expected under a Brownian threshold model) and 1 (random). The value of D can be both smaller than 0 (highly conserved) and greater than 1 (overdispersed) and the distributions of scaled D from the simulations are used to assess the significance of the observed scaled D. The plot method generates density plots of the distributions of the two simulations relative to the observed D value.
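The scaling described here can be written out as a short base R sketch. All numbers are hypothetical and chosen only to make the arithmetic easy to follow. The one-tailed proportions used for the two tests are an assumption about how the simulated distributions might be compared with the observed value, not a statement of the package's exact code.

```r
# Hypothetical sums of sister-clade differences (not real output).
observed      <- 30
random_sums   <- c(38, 41, 36, 40, 45)  # from shuffled tip states
brownian_sums <- c(18, 22, 20, 25, 15)  # from Brownian threshold simulations

mean_random   <- mean(random_sums)    # calibration point for scaled D = 1
mean_brownian <- mean(brownian_sums)  # calibration point for scaled D = 0

# Linear rescaling: the Brownian mean maps to 0 and the random mean to 1.
scale_d <- function(x) (x - mean_brownian) / (mean_random - mean_brownian)

d_estimate <- scale_d(observed)

# Assumed one-tailed tests: the proportion of simulated sums lying beyond
# the observed sum in the direction of interest.
p_random   <- mean(random_sums < observed)    # test against D = 1
p_brownian <- mean(brownian_sums > observed)  # test against D = 0
```

With these numbers the simulated means are 40 and 20, so the observed sum of 30 scales to a D of 0.5, halfway between the Brownian and random expectations.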

rnd.bias is passed to sample as the prob argument to weight the random shuffles of the observed trait. The weights are not checked for validity.
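A minimal sketch of what a weighted shuffle through the prob argument of sample looks like, using hypothetical trait values and weights:

```r
set.seed(1)  # for reproducibility of this sketch only

obs     <- c(1, 1, 0, 0, 0, 0)  # hypothetical binary trait
weights <- c(5, 5, 1, 1, 1, 1)  # hypothetical weights, one per element

# Elements with larger weights tend to be drawn earlier in the permutation.
# sample() does not require the weights to sum to one.
biased_shuffle <- sample(obs, prob = weights)
```

The result is still a permutation of `obs`, so the prevalence of the trait is unchanged; only the ordering is biased.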

Returns an object of class 'phylo.d', which is a list of the following:

`DEstimate`	The estimated D value
`Pval1`	A p value, giving the result of testing whether D is significantly different from one
`Pval0`	A p value, giving the result of testing whether D is significantly different from zero
`Parameters`	A list of the Observed, MeanRandom and MeanBrownian sums of sister-clade differences
`Permutations`	A list with elements random and brownian, containing the sums of sister-clade differences from random permutations and simulations of Brownian evolution under a threshold model
`NodalVals`	A list with the elements observed, random and brownian, containing the nodal values estimated for the observed trait and permutations. The values are stored as matrices with rows labelled by the node names in the comparative data object.
`binvar`	The binary variable used
`phyName`	The name of the phylogeny object used
`dsName`	The name of the dataframe used
`nPermut`	The number of permutations used
`rnd.bias`	If a bias was introduced to the calculation of the random distribution, the bias used, else `NULL`

Susanne Fritz <Susanne.Fritz@senckenberg.de> and David Orme

Fritz, S. A. and Purvis, A. (2010). Selectivity in mammalian extinction risk and threat types: a new measure of phylogenetic signal strength in binary traits. Conservation Biology, 24(4):1042-1051.

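library(caper)  # provides comparative.data() and phylo.d()
data(BritishBirds)
# Match the data to the tree tips using the 'binomial' column
BritishBirds <- comparative.data(BritishBirds.tree, BritishBirds.data, binomial)
# Estimate D for the Red_list binary variable (1000 permutations by default)
redPhyloD <- phylo.d(BritishBirds, binvar=Red_list)
print(redPhyloD)
plot(redPhyloD)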

Loading required package: ape
Loading required package: MASS
Loading required package: mvtnorm

Calculation of D statistic for the phylogenetic structure of a binary variable

  Data :  BritishBirds.data
  Binary variable :  Red_list
  Counts of states:  0 = 149
                     1 = 32
  Phylogeny :  BritishBirds.tree
  Number of permutations :  1000

Estimated D :  0.5663
Probability of E(D) resulting from no (random) phylogenetic structure :  0.003
Probability of E(D) resulting from Brownian phylogenetic structure    :  0.016