get_cherryDist: Get Cherry Pair Distances from a Phylogenetic Tree

View source: R/summaryStatistics.R

get_cherryDist    R Documentation

Get Cherry Pair Distances from a Phylogenetic Tree

Description

This function computes the distance between the two tips of every cherry in a phylogenetic tree. A cherry is a pair of leaf nodes (also called tips or terminal nodes) that share a direct common ancestor: two leaves attached to the same internal node, with no other leaves attached to that node. The distance of a cherry is the sum of the branch lengths on the path between its two tips.
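As a worked illustration of this definition, the short sketch below sums the two terminal branch lengths of the cherry (a, b) in the tree "((a:1,b:2):5,c:6);" using only ape. It does not call get_cherryDist(); the variable names are chosen for this example.

```r
library(ape)

phy <- read.tree(text = "((a:1,b:2):5,c:6);")

# Tip indices of the two cherry tips
tip_idx <- match(c("a", "b"), phy$tip.label)

# Length of the terminal branch leading to each tip:
# column 2 of phy$edge holds the child node of every edge
branch_len <- phy$edge.length[match(tip_idx, phy$edge[, 2])]

# Distance of the cherry: 1 + 2
cherry_dist <- sum(branch_len)
cherry_dist  # 3
```

The branch of length 5 above the cherry does not contribute, because the path between a and b only passes through their shared parent node.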

Usage

get_cherryDist(tree, input_control = TRUE)

Arguments

tree

A tree in Newick format (as a character string) or an object of class phylo from the ape package. If the input is a character string, it must follow the Newick or New Hampshire format (e.g. "((tip_1:1,tip_2:1):5,tip_3:6);"). If an object of class phylo is provided, it should represent a valid phylogenetic tree.

input_control

A logical value indicating whether to validate the input tree. If TRUE (default), the function checks that the tree is in a valid format and has at least two tips. If FALSE, the function assumes the tree is already valid and skips the validation step.

Details

Unless input_control is set to FALSE, the function first checks that the input is either a character string in Newick format or an object of class phylo. It then computes the pairwise distances between the tips of the tree and identifies the sister pairs (cherries). The distance of each cherry is the sum of the lengths of the two branches leading to its sister tips.

The tips of each cherry are identified by their names and indices. A tip index corresponds to (a) the position of the tip, counted from left to right, in the Newick string, (b) the position of the tip label in phylo_object$tip.label, and (c) the index in the methylation data list (data[[tip]][[structure]]) as obtained with the function simulate_evolData() when the given tree has several tips.
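The correspondence between (a) and (b) can be checked directly with ape, as in this minimal sketch. It uses the Newick string from the Arguments section and does not involve simulate_evolData().

```r
library(ape)

phy <- read.tree(text = "((tip_1:1,tip_2:1):5,tip_3:6);")

# Labels are stored in the order in which the tips appear in the Newick string
phy$tip.label

# Index of a given tip, looked up by name
tip_3_index <- match("tip_3", phy$tip.label)
tip_3_index  # 3
```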

If the tree is provided in Newick format, it will be parsed using the ape::read.tree function.
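The steps described above can be sketched with ape alone. The helper find_cherries_sketch() below is hypothetical and written for illustration; it is not the implementation used by get_cherryDist(), and it assumes that a cherry is an internal node with exactly two children, both of which are tips.

```r
library(ape)

# Hypothetical helper for illustration; not part of MethEvolSIM.
find_cherries_sketch <- function(phy) {
  n_tips <- Ntip(phy)
  # Tip-by-tip matrix of path lengths, rows and columns in tip-index order
  dist_mat <- cophenetic.phylo(phy)
  # Child node numbers grouped by parent; tips are numbered 1..n_tips
  children <- split(phy$edge[, 2], phy$edge[, 1])
  # Assumption: a cherry is a node with exactly two children, both tips
  is_cherry <- vapply(children,
                      function(ch) length(ch) == 2 && all(ch <= n_tips),
                      logical(1))
  pairs <- lapply(unname(children[is_cherry]),
                  function(ch) sort(as.integer(ch)))
  first <- vapply(pairs, `[`, integer(1), 1)
  second <- vapply(pairs, `[`, integer(1), 2)
  data.frame(
    first_tip_name = phy$tip.label[first],
    second_tip_name = phy$tip.label[second],
    first_tip_index = first,
    second_tip_index = second,
    dist = dist_mat[cbind(first, second)]
  )
}

# A tree with two cherries: (a, b) and (c, d)
res <- find_cherries_sketch(read.tree(text = "((a:1,b:2):5,(c:3,d:4):6);"))
res
```

For this tree the sketch returns one row per cherry, with distances 1 + 2 = 3 for (a, b) and 3 + 4 = 7 for (c, d).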

Value

A data frame with five columns:

first_tip_name

A character string representing the name of the first tip in the cherry.

second_tip_name

A character string representing the name of the second tip in the cherry.

first_tip_index

An integer representing the index of the first tip in the cherry.

second_tip_index

An integer representing the index of the second tip in the cherry.

dist

A numeric value representing the sum of the branch lengths between the two tips (i.e., the distance between the two tips of the cherry).

Examples

# Example of a tree in Newick format

newick_tree <- "((a:1,b:2):5,c:6);"

get_cherryDist(newick_tree)

# Example of using a phylo object from ape

library(ape)
tree_phylo <- read.tree(text = "((a:1,b:1):5,c:6);")

get_cherryDist(tree_phylo)


MethEvolSIM documentation built on April 12, 2025, 1:30 a.m.