distanceDistributions: Distances between random pairs of trees

distanceDistributions    R Documentation

Distances between random pairs of trees

Description

distanceDistribution25 and distanceDistribution50 are two-dimensional matrices listing the normalized distances between random pairs of bifurcating trees with 25 and 50 leaves respectively, drawn from the uniform distribution using TreeTools::RandomTree() (data objects randomTreePairs25 and randomTreePairs50). pectinateDistances11 reports distances between a pectinate 11-leaf tree and 100 000 random binary trees.
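
As a rough illustration of how one row of such a matrix could be produced, the sketch below draws random pairs of trees and computes a normalized clustering information distance for each pair. It is not the script from data-raw: the helper name RandomPairCID, the choice of unrooted trees, and the small number of pairs are assumptions made for the example, and it runs only if TreeTools and TreeDist are installed.

```r
# Hypothetical helper (not part of TreeDistData): normalized clustering
# information distances between nPairs random pairs of nTip-leaf binary trees.
RandomPairCID <- function(nPairs, nTip) {
  vapply(seq_len(nPairs), function(i) {
    # Both trees share the default leaf labels, so they can be compared
    tree1 <- TreeTools::RandomTree(nTip, root = FALSE)
    tree2 <- TreeTools::RandomTree(nTip, root = FALSE)
    TreeDist::ClusteringInfoDistance(tree1, tree2, normalize = TRUE)
  }, numeric(1))
}

# Run the sketch only when both packages are available
if (requireNamespace("TreeTools", quietly = TRUE) &&
    requireNamespace("TreeDist", quietly = TRUE)) {
  cid <- RandomPairCID(nPairs = 10, nTip = 25)
  print(summary(cid))
}
```

The packaged matrices hold one such vector per method, with 10 000 pairs rather than the 10 used here.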

Usage

distanceDistribution25

distanceDistribution50

pectinateDistances11

Format

Objects of class matrix (inherits from array) with 25 rows (24 for pectinateDistances11), each corresponding to a tree distance method and named with that method's abbreviation (listed in 'Methods tested' below), and 10 000 (distanceDistribution25/50) or 100 000 (pectinateDistances11) columns, each listing the calculated distances between one pair of trees.

An object of class matrix (inherits from array) with 25 rows and 10000 columns.

An object of class matrix (inherits from array) with 25 rows and 10000 columns.

An object of class matrix (inherits from array) with 24 rows and 100000 columns.
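
Because rows are named by method abbreviation, a method's distances can be pulled out by name and each method summarized with apply(). The sketch below uses a small stand-in matrix of random numbers with the documented layout, so that it runs without the package; standIn and its values are hypothetical and are not the packaged data.

```r
# Stand-in with the documented layout: rows = methods, columns = tree pairs.
# The values are random numbers, NOT real tree distances.
set.seed(1)
methods <- c("pid", "cid", "rf")  # a subset of the documented row names
standIn <- matrix(runif(length(methods) * 1000),
                  nrow = length(methods),
                  dimnames = list(methods, NULL))

# All distances calculated with one method, selected by row name
cidValues <- standIn["cid", ]

# Per-method summary: one column of quantiles for each method
quantiles <- apply(standIn, 1, quantile, probs = c(0.025, 0.5, 0.975))
print(quantiles)
```

With TreeDistData installed, the same indexing should apply to distanceDistribution25, distanceDistribution50 and pectinateDistances11 in place of standIn.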

Methods tested

  • pid: Phylogenetic Information Distance (Smith 2020), normalized against the phylogenetic information content of the splits in the trees being compared.

  • msid: Matching Split Information Distance (Smith 2020), normalized against the phylogenetic information content of the splits in the trees being compared.

  • cid: Clustering Information Distance (Smith 2020), normalized against the entropy of the splits in the trees being compared.

  • qd: Quartet divergence (Smith 2019), normalized against its maximum possible value for n-leaf trees.

  • nye: Nye et al. tree distance (Nye et al. 2006), normalized against the total number of splits in the trees being compared.

  • jnc2, jnc4: Jaccard-Robinson-Foulds distances with k = 2, 4, conflicting pairings prohibited ('no-conflict'), normalized against the total number of splits in the trees being compared.

  • jco2, jco4: Jaccard-Robinson-Foulds distances with k = 2, 4, conflicting pairings permitted ('conflict-ok'), normalized against the total number of splits in the trees being compared.

  • ms: Matching Split Distance (Bogdanowicz & Giaro 2012), unnormalized.

  • mast: Size of Maximum Agreement Subtree (Valiente 2009), unnormalized.

  • masti: Information content of Maximum Agreement Subtree, unnormalized.

  • nni_l, nni_L, nni_t, nni_U, nni_u: Lower, best lower, tight upper, best upper, and upper bounds for nearest-neighbour interchange distance (Li et al. 1996), unnormalized. 'Best' lower bounds jump sharply when mismatched regions of a tree become large enough that a tight upper bound cannot be exactly calculated, so they are discontinuous and cannot readily be compared between trees.

  • spr: Approximate subtree prune and regraft (SPR) distance, unnormalized.

  • tbr_l, tbr_u: Lower and upper bounds for tree bisection and reconnection (TBR) distance, calculated using TBRDist; unnormalized.

  • rf: Robinson-Foulds distance (Robinson & Foulds 1981), unnormalized.

  • icrf: Robinson-Foulds distance, splits weighted by phylogenetic information content (Smith 2020), unnormalized.

  • path: Path distance (Steel & Penny 1993), unnormalized.

  • mafi (pectinateDistances11 only): information content of the maximum agreement forest (Smith 2020).
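
Rows marked 'unnormalized' are not on a 0 to 1 scale, so they may need rescaling before being compared with the normalized rows. As one possible approach for the rf row, the sketch below divides by the maximum Robinson-Foulds distance between two unrooted binary trees with n leaves, 2(n - 3). The helper MaxRF and the example values are hypothetical, and treating the trees as unrooted is an assumption.

```r
# Maximum Robinson-Foulds distance between two unrooted binary trees with
# nTip leaves: each tree has nTip - 3 non-trivial splits, and none is shared.
MaxRF <- function(nTip) 2 * (nTip - 3)

# Hypothetical raw RF distances for 25-leaf trees (not taken from the data)
rawRF <- c(44, 42, 40, 44)

# Rescale to the range 0 to 1
normalizedRF <- rawRF / MaxRF(25)
print(normalizedRF)
```

Other unnormalized rows have different maxima, so this particular divisor applies to rf only.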

Source

Scripts used to generate data objects are housed in the data-raw directory.

References

\insertRef{Bogdanowicz2012}{TreeDist}

\insertRef{Li1996}{TreeDist}

\insertRef{Kendall2016}{TreeDistData}

\insertRef{Nye2006}{TreeDist}

\insertRef{Robinson1981}{TreeDist}

\insertRef{Smith2019}{TreeDist}

\insertRef{SmithDist}{TreeDist}

\insertRef{Steel1993}{TreeDist}

\insertRef{Valiente2009}{TreeDist}

See Also

Tree pairs between which distances were calculated are available in data objects randomTreePairs25 and randomTreePairs50.


ms609/TreeDistData documentation built on June 30, 2024, 7:21 p.m.