First, we load the required libraries:
## cleaning rm(list = ls()) ## libraries ## installing and loading the package ##install.packages("../../blepd_0.1.1.tar.gz", repos = NULL, type="source") library(blepd) packageVersion("blepd") ## to plot trees library(ggtree) library(gridExtra) library(RColorBrewer)
Now, we load the data: tree and distributions
## A binary tree with four terminals, newick format. data(tree) data(distribution) initialTree <- tree str(initialTree) ## distributions for four taxa and two area, table format. dist4taxa <- distribution
Plotting:
## A. the tree plotTree <- ggtree(initialTree, ladderize=TRUE, color="black", size=1 , linetype="dotted") + geom_tiplab(size=6, color="black") + theme_tree2() + labs(title = "A. Four terminals, equal branch length") print(plotTree) ## B. the distribution ## distribution to XY distXY <- matrix2XY(dist4taxa) plotDistrib <- ggplot(data=distXY, aes(x= Terminal, y= Area), size =11) + geom_point() + labs(title = "B. Terminals and Distributions", y = "Area", x = "Terminal") plotDistrib
Now we check whether names in both objects, trees and distributions are the same:
all(colnames(dist4taxa) == initialTree$tip.label)
We report all branch lengths and calculate the PD values.
initialTree$edge.length initialPD <- PDindex(tree=initialTree, distribution = dist4taxa) initialPD
To test the effect of changing a single terminal branch length, we will:
bestInitialArea <- row.names(dist4taxa)[which(initialPD == max(initialPD))] bestInitialArea tipToEval <- "t1" value <- 1.1 numberTipToEval <- which(initialTree$tip.label %in% tipToEval) newTree <- initialTree ## a table, binding tree$edge and tree$edge.length createTable <- function(tree = tree){allDataTable <- tree$edge allDataTable <- cbind (allDataTable, tree$edge.length) return(allDataTable) } ## new tree, with t1 branch changed to "value" newTree$edge.length[which(createTable(initialTree)[,2] %in% numberTipToEval)] <- value newTree$edge.length ## plotting plotNewTree <- ggtree(newTree) + theme_tree2() + geom_tiplab(size=6, color="black") + labs(title = "C. Four terminals, non equal branch length") print(plotNewTree) ### PD for the modified tree modifiedPD <- PDindex(tree = newTree, distribution = dist4taxa) bestModifiedArea <- row.names(dist4taxa)[which(modifiedPD == max(modifiedPD))] ## comparing initial and modified areas selected bestInitialArea bestModifiedArea bestInitialArea == bestModifiedArea
As shown, a modification in the branch length will impact the area selected, we can consider that the results are very sensible to changes in branch length.
To evaluate the behaviour of a single terminal branch length we must:
We can turn these steps into a function called evalTerminal, with four parameters:
evalTerminal(tree = initialTree, distribution = dist4taxa, tipToEval = "t1", approach = "lower" )
The lower limit when we change the branch length for terminal t1 is 0.99, as any change in branch length will modify the area selected from A1A2 to A2, as the tie between the paths terminals t1/t3 (area A1) vs t2/t4 (area A2) will be solved in favor of t2/t4 when A1 is shorter.
evalTerminal(tree = initialTree, distribution = dist4taxa, tipToEval = "t2", approach = "lower" )
A similar result will arrive from changing the terminal branch t2, but in this case the tie is solved to favour A1.
evalTerminal(tree = initialTree, distribution = dist4taxa, tipToEval = "t1", approach = "upper" )
And, we will get the same result by changing the branch length of the terminal t1 from 1 to 1.x, to find the upper limit. The tie is solved in favour of area A1, opposite to the solution when we found the lower limit. In this case, even the smaller change in any terminal branch value, will modify the results.
We test the effect of the branch length for all terminals.
newTree modifiedPD bestModifiedArea evalTerminal(tree = newTree, distribution = dist4taxa, tipToEval = "t3", approach = "upper" ) evalTerminal(tree = newTree, distribution = dist4taxa, tipToEval = "t3", approach = "lower" )
The THING is the PD difference between areas, and whether this value could be accumulated in a single terminal branch, or if the contribution to the PD value is evenly distributed among all terminal branches, and therefore to change the PD value more than one terminal branch length must have to change, to select another area.
The function to test all terminals at the same time is evalTree, with two parameters: the tree and the distribution. The function returns a data.frame object with 14 fields: labelTerminal, lowerBranchLength, InitialArea, lowerFinalArea, initialLength, upperBranchLength, upperFinalArea, changeLower, changeUpper, deltaUpper, deltaLower, deltaPD, areaDelta, and abDelta.
evalTree(tree = initialTree, distribution = dist4taxa)
The extreme sensitivity of the PD results to the terminal branch length is seen in the column absolute length difference (=abDelta), as any length change -larger than 0-, will change the area selected.
We could also swap branch lengths and evaluate the behaviour before and after swapping. This swapping could be done by swapping two terminal branch lengths simpleswap, all terminal branch lengths allswap or replacing the terminal branch length with randomly distributed values, using an uniform distribution -uniform-, where the min and max values are obtained from the actual branch lengths.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.