There are two steps necessary for summarizing tree samples. One is to decide how many distinct topologies (disregarding branch lengths) are in the collection of trees, and then we need to summarize branch lengths for each of these tree subsets using any sensible summary measure (e.g., mean or median). The functions topoFreq
and summaryBrlen
aid in these two tasks respectively. topoFreq
uses the Robinson-Foulds (RF) distance among trees in order to find clusters which are identical in topology and then bind them together in a list that we can later operate over. The RF distance is defined over unrooted trees, so it is necessary to check whether the input trees are. Unrooting a rooted tree will destroy the root node and then add up together both adjacent branch lengths, and this will be reflected when summarizing withsummaryBrlens
using the output of topoFreq
. This will return one branch length less than the number present in the original sample of rooted trees. Accordingly, this method will only work on collections of unrooted trees, or collections of time trees of fixed topology where we are only interested in finding a summary values for branch lengths. It should not be used when both the topology and divergence times are being coestimated (e.g., when the posterior of trees comes from not fixing the topology as in Beast2
, RevBayes
, and MrBayes
), but it is perfectly fine when the inference was done to estimate only the unrooted topology and branch lengths, such as in standard tree inference in MrBayes
.
We see an example using these two functions below where we want to use the median branch length for each group:
# load the packages library(ape) library(tbea) # these urls render correctly on html but not on pdf. The former is more important for CRAN. # download the trees from the supplementary material of Ballen et al. (2022) t1<-"https://raw.githubusercontent.com/gaballench/cynodontidaeWare/refs/heads/main/bayesian/mrbayes" t2<-"https://raw.githubusercontent.com/gaballench/cynodontidaeWare/refs/heads/main/bayesian/mrbayes" mtr1 <- read.nexus(paste(t1, "concatenatedMolmorph.nexus.run1.t", sep="/")) mtr2 <- read.nexus(paste(t2, "concatenatedMolmorph.nexus.run2.t", sep="/")) trees <- c(mtr1, mtr2) # calculate topological frequencies tpf <- topoFreq(trees, output="trees") # summarize median branch length sumtrees <- summaryBrlen(tpf$trees, method="median") # sort the frequencies in decreasing order decreasingIdx <- order(tpf$fabs, decreasing=TRUE) # how many topologies comprise about 90% of the trees? sum(cumsum(tpf$fabs[decreasingIdx])/sum(tpf$fabs)<0.9) # plot these four trees with branch lengths already summarized par(mfrow=c(2,2)) par(oma=c(0, 0, 0, 0)) par(mai=c(0.5, 0.2, 0.5, 0.2)) for(i in 1:4){ plot(sumtrees[[decreasingIdx[i]]], type="unrooted", show.node.label=FALSE, cex=0.4, main=round(tpf$frel[decreasingIdx[i]], digits=2)) }
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.