treeDists: Provide objects for determining distances among nodes of a...


View source: R/placeTools.R

Description

Provides objects (dists, paths) that can be used to calculate vectors of distances between an internal node and each leaf node. Also returns a square matrix of distances between leaf nodes.

Usage

treeDists(placefile, distfile)

Arguments

placefile

path to pplacer output

distfile

path to output of guppy distmat

Details

A placement on an edge looks like this:

 proximal
  |
  |   d_p
  |
  |---- x
  |
  |   d_d
  |
  |
 distal

d_p is the distance from the placement x to the proximal side of the edge, and d_d the distance to the distal side.

If the distance Q from x to a leaf y is an S-distance (serial), the path from x to y passes through the distal side of the edge, so we add d_d to Q to get the distance from x to y. If Q is a P-distance (parallel), the path passes through the proximal side of the edge, so we subtract d_d from Q. In either case we also add the length of the pendant edge, which is the second column.

To review: suppose the two leftmost columns for a placement x have the values a and b, and that x lies on edge i. We want the distance from x to a leaf y, which is on edge j. Look up entry (i,j) of the distance matrix and call it Q. If Q is an S-distance, the distance is Q+a+b; if Q is a P-distance, the distance is Q-a+b.

The distances between leaves should always be P-distances, and there we need no trickery.

(thanks to Erick Matsen for this description)
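The rule above can be written as a small function. This is only a sketch for illustration: placement_distance is a hypothetical helper, not part of clstutils, and the numbers passed to it are made up. It assumes a is the distal length d_d and b is the pendant edge length, as in the review paragraph.

```r
# Hypothetical helper (not part of clstutils): distance from a placement x
# to a leaf y, following the S-distance / P-distance rule in Details.
#   Q    : entry (i, j) of the edge distance matrix
#   type : "S" or "P", the kind of distance Q is
#   a    : distal length d_d of the placement
#   b    : pendant edge length
placement_distance <- function(Q, type = c("S", "P"), a, b) {
  type <- match.arg(type)
  # same sign convention as the 'paths' matrix: 1 for S, -1 for P
  sgn <- if (type == "S") 1 else -1
  Q + sgn * a + b
}

# Illustrative numbers only
d_serial   <- placement_distance(Q = 0.30, type = "S", a = 0.02, b = 0.01)  # Q + a + b
d_parallel <- placement_distance(Q = 0.30, type = "P", a = 0.02, b = 0.01)  # Q - a + b
```

Coding the two cases as a sign of 1 or -1 is what lets the same arithmetic be applied to whole rows of a matrix at once.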

Value

A list with the following elements:

dists

rectangular matrix of distances with rows corresponding to all nodes in pplacer order, and columns corresponding to tips in the order of the corresponding phylo{ape} object.

paths

rectangular matrix in the same configuration as dists with values of 1 if the path between nodes is serial (an S-distance) and -1 if it is parallel (a P-distance); see Details.

dmat

square matrix containing distances between pairs of tips.
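As a rough illustration of how the three elements fit together, the sketch below checks only the shape relationships stated above. check_treedists is a hypothetical helper, not part of clstutils, and the toy list is a made-up stand-in for real treeDists output; whether further invariants hold for real output is not something this page confirms.

```r
# Hypothetical checker (not part of clstutils): verifies only the shape
# relationships stated in the Value section.
check_treedists <- function(td) {
  stopifnot(
    is.list(td),
    all(c("dists", "paths", "dmat") %in% names(td)),
    identical(dim(td$dists), dim(td$paths)),  # paths has the same configuration as dists
    nrow(td$dmat) == ncol(td$dmat),           # dmat is square
    ncol(td$dists) == ncol(td$dmat)           # one column per tip in both
  )
  invisible(TRUE)
}

# Toy stand-in with 5 nodes and 3 tips; all values are made up
toy <- list(
  dists = matrix(0.1, nrow = 5, ncol = 3),
  paths = matrix(rep(c(1, -1), length.out = 15), nrow = 5, ncol = 3),
  dmat  = matrix(0.2, nrow = 3, ncol = 3)
)
check_treedists(toy)
```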

Note

The output of this function is required for classifyPlacements.

Author(s)

Noah Hoffman

References

Documentation for pplacer and guppy can be found here: http://matsen.fhcrc.org/pplacer/

See Also

classifyPlacements

Examples

placefile <- system.file('extdata','merged.json', package='clstutils')
distfile <- system.file('extdata','merged.distmat.bz2', package='clstutils')
treedists <- treeDists(placefile, distfile)

## coordinates of a single placement
placetab <- data.frame(at=49, edge=5.14909e-07, branch=5.14909e-07)

## dvects is a matrix in which each row corresponds to a vector of
## distances between a single placement along the edge of the reference
## tree used to generate 'distfile', and each column corresponds to a
## reference sequence (i.e., a terminal node).

dvects <- with(placetab, {
  treedists$dists[at+1,,drop=FALSE] + treedists$paths[at+1,,drop=FALSE]*edge + branch
})
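
The same expression also handles several placements at once, which the following sketch shows on toy data. toy_dists and toy_paths are made-up stand-ins for treedists$dists and treedists$paths, and toy_places is a hypothetical two-row placement table; the only assumption taken from the example above is that 'at' needs the + 1 offset to index rows.

```r
# Toy stand-ins for treedists$dists and treedists$paths (values made up):
# 4 nodes (rows) by 2 tips (columns)
toy_dists <- matrix(c(0.10, 0.20, 0.30, 0.40,
                      0.50, 0.60, 0.70, 0.80), nrow = 4)
toy_paths <- matrix(c( 1, -1, 1, -1,
                      -1,  1, 1, -1), nrow = 4)

# Two placements, one per row
toy_places <- data.frame(at     = c(0, 2),
                         edge   = c(0.01, 0.02),
                         branch = c(0.001, 0.002))

toy_dvects <- with(toy_places, {
  # edge and branch hold one value per selected row, so R's column-major
  # recycling applies edge[i] and branch[i] to every entry of row i
  toy_dists[at + 1, , drop = FALSE] +
    toy_paths[at + 1, , drop = FALSE] * edge + branch
})
```

Each row of toy_dvects is then the distance vector for one placement, in the same order as the rows of toy_places.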

clstutils documentation built on Nov. 8, 2020, 5:23 p.m.