subset-methods: Methods for creating subsets of phylogenies

subset-methods    R Documentation

Methods for creating subsets of phylogenies

Description

Methods for creating subsets of phylogenies, based on pruning a tree to include or exclude a set of terminal taxa, to include all descendants of the MRCA of multiple taxa, or to return a subtree rooted at a given node.

Usage

subset(x, ...)

## S4 method for signature 'phylo4'
subset(
  x,
  tips.include = NULL,
  tips.exclude = NULL,
  mrca = NULL,
  node.subtree = NULL,
  ...
)

x[i, ...]

## S4 method for signature 'phylo4,character,missing,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4,numeric,missing,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4,logical,missing,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4,missing,missing,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4d,ANY,character,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4d,ANY,numeric,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4d,ANY,logical,missing'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'phylo4,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

prune(x, ...)

## S4 method for signature 'phylo4'
prune(x, tips.exclude, trim.internal = TRUE)

## S4 method for signature 'phylo4d'
prune(x, tips.exclude, trim.internal = TRUE)

Arguments

x

an object of class "phylo4" or "phylo4d"

...

optional additional parameters (not in use)

tips.include

A vector of tips to include in the subset tree

tips.exclude

A vector of tips to exclude from the subset tree

mrca

A vector of nodes for determining the most recent common ancestor, which is then used as the root of the subset tree

node.subtree

A single internal node specifying the root of the subset tree

i

([ method) An index vector indicating tips to include

j

([ method, phylo4d only) An index vector indicating columns of node/tip data to include

drop

(not in use: for compatibility with the generic method)

trim.internal

A logical specifying whether to remove internal nodes that no longer have tip descendants in the subset tree

Details

The subset methods must be called using no more than one of the four main subsetting criteria arguments (tips.include, tips.exclude, mrca, or node.subtree). Each of these arguments can be either character or numeric: character values are treated as node labels, and numeric values are treated as node numbers. For tips.include and tips.exclude, any supplied tips not found in the tree (tipLabels(x)) will be ignored, with a warning. Similarly, for the mrca argument, any supplied tips or internal nodes not found in the tree will be ignored, with a warning. For the node.subtree argument, failure to provide a single, valid internal node will result in an error.
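As a minimal sketch of the label/number distinction and the warning behaviour described above, the following uses the geospiza data set shipped with phylobase. The label "not_a_tip" is a made-up value chosen only because it does not occur in the tree, and the variable names are illustrative.

```r
library(phylobase)
data(geospiza)
geotree <- extractTree(geospiza)

# Pick three tips by label, then look up their node numbers
labs <- unname(tipLabels(geotree))[1:3]
ids <- unname(getNode(geotree, labs))

by_label <- subset(geotree, tips.include = labs)    # character: node labels
by_number <- subset(geotree, tips.include = ids)    # numeric: node numbers
setequal(tipLabels(by_label), tipLabels(by_number))

# "not_a_tip" is a hypothetical label that is not in the tree; it should be
# ignored with a warning, which is captured here instead of being printed
msg <- NULL
with_unknown <- withCallingHandlers(
    subset(geotree, tips.include = c(labs, "not_a_tip")),
    warning = function(w) {
        msg <<- conditionMessage(w)
        invokeRestart("muffleWarning")
    }
)
setequal(tipLabels(with_unknown), labs)
```

Both comparisons are expected to be TRUE if the two forms select the same tips and the unknown label is simply dropped.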

Although prune is mainly intended as the workhorse function called by subset, it may also be called directly. In general it should be equivalent to the tips.exclude form of subset (although perhaps with less up-front error checking).
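The stated equivalence between prune and the tips.exclude form of subset can be checked with a short sketch like the one below. The two tip labels are taken from the geospiza examples; the variable names are illustrative only.

```r
library(phylobase)
data(geospiza)
geotree <- extractTree(geospiza)

drop <- c("scandens", "fortis")

via_subset <- subset(geotree, tips.exclude = drop)  # checks input, then prunes
via_prune <- prune(geotree, drop)                   # direct call

# Both trees should retain the same tips, two fewer than the original
setequal(tipLabels(via_subset), tipLabels(via_prune))
nTips(geotree) - nTips(via_prune)
```

Calling subset is probably the safer choice in interactive use, since it validates the supplied tips before pruning.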

The "[" operator, when used as x[i], is similar to the tips.include form of subset. However, the indices used with this operator can also be logical, in which case the corresponding tips are assumed to be ordered as in nodeId(x, "tip"), and recycling rules will apply (just like with a vector or a matrix). With a phylo4d object x, x[i, j] creates a subset of x, taking i as the tip index and j as the index of data variables in tdata(x, "all"). Note that the second index is optional: x[i, TRUE], x[i, ], and x[i] are all equivalent.
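The sketch below illustrates logical indexing with recycling and the three equivalent ways of omitting the column index. It assumes the geospiza data set and its wingL and beakD columns, as used in the examples for this page; the variable names are illustrative.

```r
library(phylobase)
data(geospiza)
geotree <- extractTree(geospiza)

# A length-2 logical index is recycled over the tips, in nodeId(x, "tip")
# order, so this keeps every other tip
keep <- c(TRUE, FALSE)
every_other <- geotree[keep]
nTips(every_other)

# For a phylo4d object, these three calls should select the same tips
# and keep all data columns
i <- 1:4
a <- geospiza[i, TRUE]
b <- geospiza[i, ]
d <- geospiza[i]
identical(names(tdata(a, "tip")), names(tdata(d, "tip")))

# Restrict the data columns as well
two_cols <- geospiza[i, c("wingL", "beakD")]
names(tdata(two_cols, "tip"))
```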

Regardless of which approach to subsetting is used, the argument values must be such that at least two tips are retained.

If the most recent common ancestor of the retained tips is not the original root node, then the root node of the subset tree will be a descendant of the original root. For rooted trees with non-NA root edge length, this has implications for the new root edge length. In particular, the new length will be the summed edge length from the new root node back to the original root (including the original root edge). As an alternative, see the examples for a way to determine the length of the edge that was immediately ancestral to the new root node in the original tree.

Note that the correspondence between nodes and labels (and data in the case of phylo4d) will be retained after all forms of subsetting. Beware, however, that the node numbers (IDs) will likely be altered to reflect the new tree topology, and therefore cannot be compared directly between the original tree and the subset tree.
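To make the point about labels and node numbers concrete, the following sketch looks up one tip by label before and after subsetting. It assumes the geospiza data set and that "scandens" has trait data attached; the variable names are illustrative.

```r
library(phylobase)
data(geospiza)

sub <- geospiza[c("scandens", "fortis", "pauper")]

# Node numbers are reassigned in the subset tree, so look nodes up by label
# rather than reusing a number taken from the original tree
id_before <- unname(getNode(geospiza, "scandens"))
id_after <- unname(getNode(sub, "scandens"))
c(before = id_before, after = id_after)

# The trait data still follow the label
row_before <- unlist(tdata(geospiza, "tip")["scandens", ])
row_after <- unlist(tdata(sub, "tip")["scandens", ])
all.equal(row_before, row_after)
```

Matching on labels in this way is one reasonable approach to comparing nodes between an original tree and a subset of it.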

Value

an object of class "phylo4" or "phylo4d"

Methods

x = "phylo4"

subset tree

x = "phylo4d"

subset tree and corresponding node and tip data

Author(s)

Jim Regetz regetz@nceas.ucsb.edu
Steven Kembel skembel@berkeley.edu
Damien de Vienne damien.de-vienne@u-psud.fr
Thibaut Jombart jombart@biomserv.univ-lyon1.fr

Examples

data(geospiza)
nodeLabels(geospiza) <- paste("N", nodeId(geospiza, "internal"), sep="")
geotree <- extractTree(geospiza)

## "subset" examples
tips <- c("difficilis", "fortis", "fuliginosa", "fusca", "olivacea",
    "pallida", "parvulus", "scandens")
plot(subset(geotree, tips.include=tips))
plot(subset(geotree, tips.include=tips, trim.internal=FALSE))
plot(subset(geotree, tips.exclude="scandens"))
plot(subset(geotree, mrca=c("scandens","fortis","pauper")))
plot(subset(geotree, node.subtree=18))

## "prune" examples (equivalent to subset using tips.exclude)
plot(prune(geotree, tips))

## "[" examples (equivalent to subset using tips.include)
plot(geotree[c(1:6,14)])
plot(geospiza[c(1:6,14)])

## for phylo4d, subset both tips and data columns
geospiza[c(1:6,14), c("wingL", "beakD")]

## note handling of root edge length:
edgeLength(geotree)['0-15'] <- 0.1
geotree2 <- geotree[1:2]
## in subset tree, edge of new root extends back to the original root
edgeLength(geotree2)['0-3']
## edge length immediately ancestral to this node in the original tree
edgeLength(geotree, MRCA(geotree, tipLabels(geotree2)))

phylobase documentation built on May 29, 2024, 11:24 a.m.