ksi: KSI
In richfitz/ksi: Phylogeny-based Kolmogorov-Smirnov Importance Statistic


View source: R/ksi.R

This is the fitting function. The test works by sequentially fitting a series of (possibly nested) nodes and asking “Is the trait distribution for the clade descending from this node different from that of the node's neighbourhood?”. The neighbourhood is defined as all the species in the 'partition' of the parent of the focal node that are:

Not descended from that node
Not descended from a node that was identified in a previous iteration (so, when fitting node i, we exclude all species descended from nodes 1..(i-1))
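The two exclusion rules above can be sketched with plain set operations. This is only an illustration under stated assumptions: the clade membership list `descendants`, the node names and the helper `neighbourhood()` are all hypothetical and are not part of the package.

```r
# Hypothetical clade membership: the tips descended from each node.
descendants <- list(
  root = c("a", "b", "c", "d", "e", "f", "g", "h"),
  n1   = c("a", "b", "c", "d"),
  n3   = c("e", "f")
)

# Tips in the neighbourhood of `focal`: start from the partition that holds
# the focal node's parent, then drop the focal clade and every clade that
# was picked in an earlier round.
neighbourhood <- function(focal, partition, previous, descendants) {
  excluded <- unlist(descendants[c(focal, previous)], use.names = FALSE)
  setdiff(partition, excluded)
}

# Round 1: nothing has been picked yet, so only the focal clade is dropped.
nb1 <- neighbourhood("n1", descendants$root, character(0), descendants)

# Round 2: n1 was picked in round 1, so its tips are dropped as well.
nb2 <- neighbourhood("n3", descendants$root, "n1", descendants)
```

Here `nb1` holds the tips outside `n1`, and `nb2` holds only the tips that belong to neither `n3` nor the previously picked `n1`.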

This is basically the same algorithm as MEDUSA uses for fitting diversification rates that vary across a phylogeny.
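The question asked at each node is a two-sample comparison between the target clade and its neighbourhood. The sketch below shows what such comparisons look like in base R for a continuous trait and for a categorical trait. It is not the package's internal code: the data are simulated, the G statistic is computed by hand because base R has no G-test function, and the sample-size scaling of the Kolmogorov-Smirnov statistic is not reproduced.

```r
set.seed(42)

# Continuous trait: two-sample Kolmogorov-Smirnov test.
target <- rnorm(20, mean = 1)
neighbourhood <- rnorm(40, mean = 0)
ks <- ks.test(target, neighbourhood)

# Categorical trait: contingency table of group by state.
group <- rep(c("target", "neighbourhood"), c(20, 40))
state <- c(rep(c("x", "y"), c(15, 5)), rep(c("x", "y"), c(10, 30)))
tab <- table(group, state)

# Chi-squared test on the table (no continuity correction).
chi <- chisq.test(tab, correct = FALSE)

# G-test by hand: G = 2 * sum(O * log(O / E)), compared to a chi-squared
# distribution with (rows - 1) * (cols - 1) degrees of freedom.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
G <- 2 * sum(tab * log(tab / expected))
g_p <- pchisq(G, df = (nrow(tab) - 1) * (ncol(tab) - 1), lower.tail = FALSE)
```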

ksi(tree, dat, depth=10, test=NULL, verbose=TRUE, multicore=FALSE, multicore.args=list())
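A minimal call might look like the sketch below. It assumes that both `ape` and `ksi` are installed and that `ksi` is exported from the package namespace. The random tree and the simulated trait values are placeholders, and the block is guarded so that it does nothing when either package is missing.

```r
# Guarded so the script still runs where ape or ksi is not installed.
have_pkgs <- requireNamespace("ape", quietly = TRUE) &&
  requireNamespace("ksi", quietly = TRUE)

if (have_pkgs) {
  set.seed(1)
  tree <- ape::rtree(30)                      # random 30-tip phylogeny
  dat <- setNames(rnorm(30), tree$tip.label)  # named vector of tip states
  fit <- ksi::ksi(tree, dat, depth = 3, verbose = FALSE)
  print(names(fit))                           # labels of the nodes that were fit
}
```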

`tree`	A phylogeny of class `phylo` (ape's phylogeny format). Branch lengths are not required, and are ignored if present. Node labels are recommended. If any nodes lack labels, labels of the form 'node.xxx' will be created. This happens before removing species that do not have trait data, so that these node labels will be the same across different analyses.

`dat`	A named vector of tip states. This can be numeric, a factor (categorical) or logical (`TRUE`/`FALSE`).
`depth`	The number of nested nodes to identify (integer from 1 to the number of nodes in the tree once all taxa that lack state information are removed).
`test`	Force a particular test to be used. Valid values are `"ks"` (a Kolmogorov-Smirnov test on continuous-valued distributions), `"chisq"` (a chi-squared test on a contingency table, for categorical or binary traits) and `"gtest"` (a G-test on a contingency table, for categorical or binary traits). If `NULL` (the default), the test is guessed from `dat`.
`verbose`	Print a (fairly small) amount of progress information as the fits proceed.
`multicore`	Logical: if `TRUE`, uses the `multicore` package to carry out some of the calculations in parallel.
`multicore.args`	Arguments to control the behaviour of `mclapply`, when `multicore=TRUE`. For example `multicore.args=list(mc.cores=4, mc.preschedule=FALSE)` specifies that `mclapply` should use 4 cores, and use its load-balancing algorithm.
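To illustrate what the entries of `multicore.args` control, the sketch below splices such a list into a direct `mclapply` call. Two assumptions apply: it uses `parallel::mclapply` from the `parallel` package shipped with current R rather than the older `multicore` package, and the `do.call` splicing is a guess at how the arguments might be forwarded, not the package's confirmed mechanism. The function `square` is a hypothetical stand-in for a per-node calculation.

```r
# Forking is unavailable on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
multicore.args <- list(mc.cores = cores, mc.preschedule = FALSE)

square <- function(i) i^2  # stand-in for a per-node calculation

# Assumption: the list is appended to the other mclapply arguments.
res <- do.call(parallel::mclapply,
               c(list(X = 1:4, FUN = square), multicore.args))
```

With `mc.preschedule=FALSE`, each element of `X` is handed to a worker as one becomes free, which tends to help when individual jobs vary a lot in run time.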

The ksi function returns an object of class ksi. This is a list with one element per node, named with the names of the nodes that were fit. Each list element contains the elements:

statistic: Value of the statistic against which nodes are ranked (for the Kolmogorov-Smirnov test, this is D scaled by the relative sample sizes – see ?ks.test).
p.value: The p-value from the test (often meaninglessly small on large trees or uneven partitions).
n: A vector of length 2 with the number of species in the 'target' and 'neighbourhood' of the node, respectively.
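Because the result is a list with one element per node, it can be flattened into a table for inspection. The sketch below builds a mock object with the layout described above. The node names and numbers are invented placeholders, not output from a real fit.

```r
# Mock object with the documented layout; the values are placeholders.
obj <- structure(
  list(
    node.12 = list(statistic = 2.1, p.value = 1e-4, n = c(8, 22)),
    node.17 = list(statistic = 1.4, p.value = 3e-2, n = c(5, 17))
  ),
  class = "ksi"
)

fits <- unclass(obj)  # plain list, one element per fitted node

# One row per node: statistic, p-value and the two sample sizes.
summary_table <- data.frame(
  node = names(fits),
  statistic = sapply(fits, function(x) x$statistic),
  p.value = sapply(fits, function(x) x$p.value),
  n.target = sapply(fits, function(x) x$n[1]),
  n.neighbourhood = sapply(fits, function(x) x$n[2]),
  row.names = NULL
)
```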

In addition, the ksi object (obj, say) has a number of attributes:

attr(obj, "tree"): The phylogeny, after adding node labels and dropping species that have no trait data.
attr(obj, "dat"): The data, altered to have the same contents and order as attr(obj, "tree")$tip.label.
attr(obj, "contents"): A list with one element per node. Each element is a list with elements neighbourhood, target and other, containing the indices of the species that were in each test. These are indices against attr(obj, "dat") for now.
attr(obj, "statistics"): A list with one element per node. Each element is a vector along the nodes (in ape index) with the statistic for each node for that round. Nodes that were used in previous rounds have a value of NA.
attr(obj, "test"): Indicates which test (ks, chisq, gtest) was done.
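The attributes can be combined to recover the trait values that went into a given comparison. The sketch below uses a mock object carrying only the `dat`, `contents` and `test` attributes. The species names and values are invented, and the `contents` list is accessed by position because the source does not say whether it is named.

```r
# Mock trait data and object; the values are placeholders.
dat <- c(sp1 = 0.3, sp2 = 1.2, sp3 = 0.9, sp4 = 2.5, sp5 = 2.2)
obj <- structure(
  list(node.7 = list(statistic = 1.8, p.value = 0.01, n = c(2, 3))),
  class = "ksi",
  dat = dat,
  contents = list(list(neighbourhood = 1:3, target = 4:5,
                       other = integer(0))),
  test = "ks"
)

# Indices in "contents" point into attr(obj, "dat").
parts <- attr(obj, "contents")[[1]]
target_vals <- attr(obj, "dat")[parts$target]
neighbourhood_vals <- attr(obj, "dat")[parts$neighbourhood]
```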