PhyloDistance-JRFDist | R Documentation |
Calculate JRF distance between two unrooted phylogenies.
This function is called as part of PhyloDistance and calculates the Jaccard-Robinson-Foulds (JRF) distance between two unrooted phylogenies. Each dendrogram is first pruned to only the internal branches that imply a partition of the shared leaf set; trivial partitions (where one side contains 1 or 0 leaves) are ignored.
The total score is calculated by pairing branches and scoring their similarity. For a pair of branches A and B that partition the leaves into (A_1, A_2) and (B_1, B_2), respectively, the distance between the branches is calculated as:
2 - 2\left(\frac{|X \cap Y|}{|X \cup Y|}\right)^k
where X \in (A_1, A_2),\; Y \in (B_1, B_2) are chosen to maximize the score of the pairing, and k is the value of ExpVal. The sum of these scores over all branch pairings gives the overall distance between the two trees, which is then normalized by the number of branches in each tree.
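The per-branch formula can be illustrated with a minimal sketch. The helper branch_dist below is hypothetical and not part of the package; it assumes that "maximizing the score of the pairing" means choosing the sides X and Y with the largest Jaccard ratio, and it scores a single pair of branches only (it does not perform the pairing of branches across trees or the final normalization).

```r
# Hypothetical helper for illustration only; not the package's implementation.
# A and B are each a list of two character vectors: the two sides of a bipartition.
branch_dist <- function(A, B, k = 1) {
  # Jaccard ratio |X intersect Y| / |X union Y| for two leaf sets
  jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))
  # Assumption: pick the sides X in A and Y in B with the largest Jaccard ratio
  best <- max(vapply(A, function(x) {
    max(vapply(B, function(y) jaccard(x, y), numeric(1)))
  }, numeric(1)))
  2 - 2 * best^k
}

# Two branches over the shared leaf set {a, b, c, d, e}
A <- list(c("a", "b"), c("c", "d", "e"))
B <- list(c("a", "b", "c"), c("d", "e"))

branch_dist(A, B, k = 1)  # best Jaccard ratio here is 2/3
branch_dist(A, B, k = 2)  # a larger exponent penalizes the imperfect match more
branch_dist(A, A, k = 1)  # identical partitions have distance 0
```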
There are a few special cases to this distance. If ExpVal=1, the distance is equivalent to the metric introduced in Nye et al. (2006). As ExpVal approaches infinity, the value approaches the (non-generalized) Robinson-Foulds distance.
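The effect of the exponent can be seen by evaluating 2 - 2 * J^k for a few Jaccard ratios J. The values below are made up for illustration and are not taken from any real tree; the point is only that with a very large exponent every imperfect match scores close to 2 while a perfect match scores 0, which mirrors the all-or-nothing behavior of Robinson-Foulds.

```r
# Illustrative Jaccard ratios for four hypothetical branch pairings
jaccard <- c(identical = 1, close = 0.9, partial = 2/3, poor = 0.2)
exponents <- c(1, 2, 1000)

# One column of per-branch distances per exponent
scores <- sapply(exponents, function(k) 2 - 2 * jaccard^k)
colnames(scores) <- paste0("k=", exponents)

round(scores, 3)
```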
Returns a normalized distance, with 0 indicating identical trees and 1 indicating maximal difference.
If RawScore=TRUE, returns a named vector of length 3 whose first entry is the summed distance score over the branch pairings and whose subsequent entries are the number of partitions in each tree.
If the trees have no leaves in common, the function returns 1 if RawScore=FALSE, and c(0, NA, NA) if RawScore=TRUE.
Note that this function requires the input dendrograms to be labeled alike (e.g., the leaf labeled abc in dend1 represents the same species as the leaf labeled abc in dend2). Labels can easily be modified using dendrapply.
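As a rough sketch of relabeling, the snippet below uses dendrapply from the stats package to rewrite every leaf label. The helper relabel and the toy data are hypothetical; upper-casing stands in for whatever mapping makes the labels of the two trees agree.

```r
# Toy dendrogram with leaf labels a to e (made-up data)
pts <- matrix(1:10, ncol = 2, dimnames = list(letters[1:5], NULL))
dend <- as.dendrogram(hclust(dist(pts)))

# Hypothetical helper: change the label attribute of leaf nodes only
relabel <- function(node) {
  if (is.leaf(node)) {
    attr(node, "label") <- toupper(attr(node, "label"))
  }
  node
}

dend_relabeled <- dendrapply(dend, relabel)
labels(dend_relabeled)
```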
Aidan Lakshman ahl27@pitt.edu
Nye, T. M. W., Liò, P., & Gilks, W. R. A novel algorithm and web-based tool for comparing two alternative phylogenetic trees. Bioinformatics, 2006. 22(1): 117–119.
Böcker, S., Canzar, S., & Klau, G. W. The generalized Robinson-Foulds metric. Algorithms in Bioinformatics, 2013. 8126: 156–169.
# making some toy dendrograms
set.seed(123)
dm1 <- as.dist(matrix(runif(64, 0.5, 5), ncol=8))
dm2 <- as.dist(matrix(runif(64, 0.5, 5), ncol=8))
tree1 <- as.dendrogram(hclust(dm1))
tree2 <- as.dendrogram(hclust(dm2))
# Nye Metric
PhyloDistance(tree1, tree2, Method="JRF", JRFExp=1)
# Jaccard-RobinsonFoulds
PhyloDistance(tree1, tree2, Method="JRF", JRFExp=2)
# Good approximation to RF Dist (note RFDist is much faster for this)
PhyloDistance(tree1, tree2, Method="JRF", JRFExp=1000)
PhyloDistance(tree1, tree2, Method="RF")