View source: R/tree_distance_kendall-colijn.R
KendallColijn | R Documentation |
Calculate the Kendall–Colijn tree distance, a measure related to the path difference.
KendallColijn(tree1, tree2 = NULL, Vector = KCVector)
KCVector(tree)
PathVector(tree)
SplitVector(tree)
KCDiameter(tree)
tree1, tree2 | Trees of class |
Vector | Function converting a tree to a numeric vector. |
tree | A tree of class |
The Kendall–Colijn distance works by measuring, for each pair of leaves, the distance between the most recent common ancestor of those leaves and the root node. For a given tree, this produces a vector with one value per pair of leaves, recording how far that pair's most recent common ancestor lies from the root.
Two trees are compared by taking the Euclidean distance between their respective vectors, i.e. the square root of the sum of the squared differences between corresponding elements of the two vectors.
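The calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `MrcaDepthVector()` is a hypothetical helper, each tree is described by hand as a list giving every leaf's ancestors from the root downwards, and the sketch assumes that every edge has length one. The internal node names (`"root"`, `"ab"`, and so on) are arbitrary labels chosen for the example.

```r
# Each leaf is described by its ancestors, listed from the root downwards.
# Balanced tree ((a, b), (c, d)):
balanced <- list(a = c("root", "ab"), b = c("root", "ab"),
                 c = c("root", "cd"), d = c("root", "cd"))
# Pectinate tree (a, (b, (c, d))):
pectinate <- list(a = "root", b = c("root", "bcd"),
                  c = c("root", "bcd", "cd"), d = c("root", "bcd", "cd"))

# Hypothetical helper: number of edges between the root and the most recent
# common ancestor (MRCA) of each pair of leaves
MrcaDepthVector <- function(ancestry) {
  pairs <- combn(sort(names(ancestry)), 2) # one column per pair of leaves
  depths <- apply(pairs, 2, function(pair) {
    # Shared ancestors run from the root down to the MRCA
    shared <- intersect(ancestry[[pair[1]]], ancestry[[pair[2]]])
    length(shared) - 1 # the root itself lies at depth zero
  })
  names(depths) <- paste(pairs[1, ], pairs[2, ], sep = "-")
  depths
}

vBalanced <- MrcaDepthVector(balanced)   # 1 0 0 0 0 1
vPectinate <- MrcaDepthVector(pectinate) # 0 0 0 1 1 2

# Euclidean distance between the two vectors
kcDistance <- sqrt(sum((vBalanced - vPectinate) ^ 2))
kcDistance # 2
```

Four of the six leaf pairs differ by one unit between the two trees, so the sum of squared differences is four and the distance is two.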
An analogous distance can be created from any vector representation of a tree. The split size vector metric \insertCite{SmithSpace}{TreeDist} is an attempt to mimic the Kendall–Colijn metric in situations where the position of the root should not be afforded special significance. The path distance \insertCite{Steel1993}{TreeDist} is a familiar alternative whose underlying vector measures the distance of the last common ancestor of each pair of leaves from the leaves themselves, i.e. the length of the path from one leaf to another.
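To illustrate how a different vector yields an analogous distance, the following base R sketch counts the edges on the path between each pair of leaves. It is an illustration under stated assumptions rather than the package's code: `PathLengthVector()` is a hypothetical helper, the trees are written out by hand as lists of each leaf's ancestors from the root downwards, and the two edges that meet at the root are counted separately.

```r
# Each leaf is described by its ancestors, listed from the root downwards.
# Balanced tree ((a, b), (c, d)):
balanced <- list(a = c("root", "ab"), b = c("root", "ab"),
                 c = c("root", "cd"), d = c("root", "cd"))
# Pectinate tree (a, (b, (c, d))):
pectinate <- list(a = "root", b = c("root", "bcd"),
                  c = c("root", "bcd", "cd"), d = c("root", "bcd", "cd"))

# Hypothetical helper: number of edges on the path between each pair of leaves
PathLengthVector <- function(ancestry) {
  pairs <- combn(sort(names(ancestry)), 2) # one column per pair of leaves
  lengths <- apply(pairs, 2, function(pair) {
    first <- ancestry[[pair[1]]]
    second <- ancestry[[pair[2]]]
    shared <- length(intersect(first, second)) # ancestors down to the MRCA
    # Edges from each leaf up to the most recent common ancestor
    (length(first) - shared + 1) + (length(second) - shared + 1)
  })
  names(lengths) <- paste(pairs[1, ], pairs[2, ], sep = "-")
  lengths
}

pBalanced <- PathLengthVector(balanced)   # 2 4 4 4 4 2
pPectinate <- PathLengthVector(pectinate) # 3 4 4 3 3 2

# Euclidean distance between the two path vectors
pathDistance <- sqrt(sum((pBalanced - pPectinate) ^ 2))
pathDistance # sqrt(3)
```

Only the vector changes; the comparison step is the same Euclidean distance used for the Kendall–Colijn metric.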
None of these vector-based methods performs as well as other tree distances in measuring similarities in the relationships implied by a pair of trees \insertCite{SmithDist}{TreeDist}. In particular, the Kendall–Colijn metric is strongly influenced by tree balance, and may not be appropriate for a suite of common applications \insertCite{SmithSpace}{TreeDist}.
KendallColijn() returns an array of numerics providing the distances between each pair of trees in tree1 and tree2, or splits1 and splits2.
KCDiameter() returns the value of Kendall & Colijn's (2016) metric distance between two pectinate trees with n leaves ordered in opposite directions, which I suggest (without any attempt at a proof) may be a useful proxy for the diameter (i.e. maximum value) of the K–C metric.
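The quantity described here can be sketched directly in base R. This is a minimal illustration rather than the package's implementation: `OpposedPectinateDistance()` is a hypothetical helper, and it assumes unit edge lengths and leaves numbered 1 to n. In the pectinate tree (1, (2, (3, ...))), the most recent common ancestor of leaves i < j lies i - 1 edges from the root; in the tree with the opposite leaf order, it lies n - j edges from the root.

```r
# Hypothetical helper: distance between two pectinate trees on the same
# leaves whose leaf orders are reversed relative to one another
OpposedPectinateDistance <- function(nLeaves) {
  pairs <- combn(nLeaves, 2)        # each column holds a pair of leaves i < j
  forwards <- pairs[1, ] - 1        # MRCA depth in (1, (2, (3, ...)))
  backwards <- nLeaves - pairs[2, ] # MRCA depth in (n, (n - 1, (n - 2, ...)))
  sqrt(sum((forwards - backwards) ^ 2))
}

OpposedPectinateDistance(3) # sqrt(2)
OpposedPectinateDistance(4) # sqrt(10), approximately 3.16
```

For four leaves, the per-pair differences are -2, -1, 0, 0, 1 and 2, whose squares sum to ten.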
KCVector(): Creates a vector that characterises a rooted tree, as described in \insertCite{Kendall2016;textual}{TreeDist}.
PathVector(): Creates a vector reporting the number of edges between each pair of leaves, per the path metric of \insertCite{Steel1993;textual}{TreeDist}.
SplitVector(): Creates a vector reporting the smallest split containing each pair of leaves, per the metric proposed in \insertCite{SmithSpace;textual}{TreeDist}.
Martin R. Smith (martin.smith@durham.ac.uk)
treespace::treeDist is a more sophisticated, if more cumbersome, implementation that supports lambda > 0, i.e. use of edge lengths in tree comparison.
Other tree distances: JaccardRobinsonFoulds(), MASTSize(), MatchingSplitDistance(), NNIDist(), NyeSimilarity(), PathDist(), Robinson-Foulds, SPRDist(), TreeDistance()
library("TreeDist") # Provides KendallColijn() and its companion functions
KendallColijn(TreeTools::BalancedTree(8), TreeTools::PectinateTree(8))
set.seed(0)
KendallColijn(TreeTools::BalancedTree(8), lapply(rep(8, 3), ape::rtree))
KendallColijn(lapply(rep(8, 4), ape::rtree))
KendallColijn(lapply(rep(8, 4), ape::rtree), Vector = SplitVector)
# Notice that changing tree shape close to the root results in much
# larger differences
tree1 <- ape::read.tree(text = "(a, (b, (c, (d, (e, (f, (g, h)))))));")
tree2 <- ape::read.tree(text = "(a, ((b, c), (d, (e, (f, (g, h))))));")
tree3 <- ape::read.tree(text = "(a, (b, (c, (d, (e, ((f, g), h))))));")
trees <- c(tree1, tree2, tree3)
KendallColijn(trees)
KendallColijn(trees, Vector = SplitVector)
KCDiameter(4)
KCDiameter(trees)