In richfitz/ksi: Phylogeny-based Komogorov-Smirnov Importance Statistic

knitr::opts_chunk$set(echo = TRUE)

Finding Distinctive Clades with the Komogorov-Smirnov Importance Statistic

This is the package that we used in Cornwell et al 2014. The idea is to find clades that are a mix of large and unusual with respect to trait values. The weighting of largeness and unusualness is done via the classic Komogorov-Smirnov test statistic. All nodes on the tree are tested for "distinctiveness". This can be done iteratively as shown below.

The package can be installed using devtools, which itself can be installed from CRAN with

# install.packages("devtools") # uncomment this line if you don't have devtools installed
library(devtools)
install_github("traitecoevo/ksi")
library(ksi)

and the package will be installed.

There is really only one useful function in the package: ksi. See the help page ?ksi for more information once installed and loaded.

The key thing to know is that node labels are essential for this process, so they should be as informative as possible in your input tree. Depth is the number of distinctive clades that the algorithm returns.

library(ape)
library(knitr)
tree <- rtree(1000)
tree$node.label <- paste0("nd", seq_len(tree$Nnode))
vals <- setNames(runif(1000), tree$tip.label)
output <- ksi(tree, vals, depth = 2,verbose = FALSE)
kable(summary(output))

This output gives the two most distinctive clades for out made-up phylogeny. Because of the nature of phylogenies, this returns the node from which the most distinctive single clade descendended, but also includes neighboring nodes which are very difficult to distinguish statistically.

There is more statistical detail inside the returned object, as well as the data split in a useful way for easy visualization: