make.phylo: Phylogeny generating

View source: R/make.phylo.R

make.phyloR Documentation

Phylogeny generating

Description

Generates a phylogeny from a sim object containing speciation and extinction times, parent and status information (see ?sim). Returns a phylo object containing information on the phylogeny, following an "evolutionary Hennigian" (sensu Ezard et al 2011) format (i.e., a bifurcating tree). Takes an optional argument encoding fossil occurrences to return a sampled ancestor tree (see references). This tree consists of the original tree, plus the fossil occurrences added as branches of length 0 branching off of the corresponding species at the time of occurrence. Such trees can be used, as is or with small modifications, as starting trees in phylogenetic inference software that make use of the fossilized birth-death model. Returns NA and sends a warning if the simulation has only one lineage or if more than one species has NA as parent (i.e. there is no single common ancestor in the simulation). In the latter case, please use find.lineages first.

Usage

make.phylo(
  sim,
  fossils = NULL,
  saFormat = "branch",
  returnTrueExt = TRUE,
  returnRootTime = NULL
)

Arguments

sim

A sim object, containing extinction times, speciation times, parent, and status information for each species in the simulation. See ?sim.

fossils

A data frame with a "Species" column and a SampT column, usually an output of the sample.clade function. Species names must contain only one number each, corresponding to the order of the sim vectors. Note that we require it to have a SampT column, i.e. fossils must have an exact age. This assumption might be relaxed in the future.

saFormat

A string indicating which form sampled ancestors should take in the tree. If set to "branch" (default), SAs are returned as 0-length branches. If set to "node", they are returned as degree-2 nodes. Note that some software prefer the former (e.g. Beast) and some the latter (e.g. RevBayes). The code for making 0-length branches become nodes was written by Joshua A. Justison.

returnTrueExt

A logical indicating whether to include in tree the tips representing the true extinction time of extinct species. If set to FALSE, the returned tree will include those tips. If TRUE (default), they will be dropped and instead the last sampled fossil of a given species will be the last sampled tip of that species. Note that if a species was not sampled it will then not appear in the tree. If no fossils have been added to the tree with the parameter fossils, this will have the same effect as the ape function drop.fossil, returning an ultrametric tree. Note that if this is set to FALSE, the root.time and root.edge arguments will not be accurate, depending on which species are dropped. The user is encouraged to use the ape package to correct these problems, as shown in an example below.

returnRootTime

Logical indicating if phylo should have information regarding root.time. If set to NULL (default), returned phylogenies will not have root.time if there is at least one extant lineage in the sim object. If there are only extinct lineages in the sim object and it is set to NULL, root.time will be returned. If set to FALSE or TRUE, root.time will be removed or forced into the phylo object, respectively. In this case, we highly recommend users to read about the behavior of some functions (such as APE's axisPhylo) when this argument is forced.

Details

When root.time is added to a phylogeny, packages such as APE can change their interpretation of the information in the phylo object. For instance, a completely extinct phylogeny might be interpreted as extant if there is no info about root.time. This might create misleading interpretations even with simple functions such as ape::axisPhylo. make.phylo tries to accommodate different evo/paleo practices in its default value for returnRootTime by automatically attributing root.time when the sim object is extinct. We encourage careful inspection of output if users force make.phylo to use a specific behavior, especially when using phylogenies generated by this function as input in functions from other packages. For extinct phylogenies, it might usually be important to explicitly provide information that the edge is indeed a relevant part of the phylogeny (for instance adding root.edge = TRUE when plotting a phylogeny with root.time information with ape::plot.phylo. An example below provides a visualization of this issue.

Value

A phylo object from the APE package. Tip labels are numbered following the order of species in the sim object. If fossil occurrence data was supplied, the tree will include fossil occurrences as tips with branch length 0, bifurcating at its sampling time from the corresponding species' edge (i.e. a sampled ancestor tree). Note that to obtain a true sampled ancestor (SA) tree, one must perform the last step of deleting tips that are not either extant or fossil occurrences (i.e. the tips at true time of extinction).

Note this package does not depend on APE (Paradis et al, 2004) since it is never used inside its functions, but it is suggested since one might want to manipulate the phylogenies generated by this function. Furthermore, a limited version of the drop.tip function from APE has been copied for use in this function (namely, due to the parameter returnTrueExt). Likewise, a limited version of collapse.singles and node.depth.edgelength were also copied to support those features. One does not need to have APE installed for the function to use that code, but the authors wished to do their due diligence by crediting the package and its maintainers.

Author(s)

Matheus Januario and Bruno do Rosario Petrucci

References

Ezard, T. H., Pearson, P. N., Aze, T., & Purvis, A. (2012). The meaning of birth and death (in macroevolutionary birth-death models). Biology letters, 8(1), 139-142.

Paradis, E., Claude, J., Strimmer, & K. (2004). APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics, 20(2), 289-290.

Heath, T. A., Huelsenbeck, J. P., & Stadler, T. (2014). The fossilized birth–death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences, 111(29), E2957-E2966.

Examples


###
# we can start with a simple phylogeny

# set a simulation seed
set.seed(1) 

# simulate a BD process with constant rates
sim <- bd.sim(n0 = 1, lambda = 0.3, mu = 0.1, tMax = 10, 
             nExtant = c(2, Inf))

# make the phylogeny
phy <- make.phylo(sim)

# plot it
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # change par to show phylogenies
  par(mfrow = c(1, 2))
  
  ape::plot.phylo(phy)
  
  # we can also plot only the molecular phylogeny
  ape::plot.phylo(ape::drop.fossil(phy))
  
  # reset par
  par(oldPar)
}

###
# this works for sim generated with any of the scenarios in bd.sim

# set seed
set.seed(1)

# simulate
sim <- bd.sim(n0 = 1, lambda = function(t) 0.2 + 0.01*t, 
             mu = function(t) 0.03 + 0.015*t, tMax = 10, 
             nExtant = c(2, Inf))

# make the phylogeny
phy <- make.phylo(sim)

# plot it
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # change par to show phylogenies
  par(mfrow = c(1, 2))
  
  # plot phylogeny
  ape::plot.phylo(phy)
  ape::axisPhylo()
  
  # we can also plot only the molecular phylogeny
  ape::plot.phylo(ape::drop.fossil(phy))
  ape::axisPhylo()
  
  # reset par 
  par(oldPar)
}

### 
# we can use the fossils argument to generate a sample ancestors tree

# set seed
set.seed(1)

# simulate a simple birth-death process
sim <- bd.sim(n0 = 1, lambda = 0.2, mu = 0.05, tMax = 10, 
              nExtant = c(2, Inf))

# make the traditional phylogeny
phy <- make.phylo(sim)

# sample fossils
fossils <- sample.clade(sim, 0.1, 10)

# make the sampled ancestor tree
fbdPhy <- make.phylo(sim, fossils)

# plot them
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # visualize longevities and fossil occurrences
  draw.sim(sim, fossils = fossils)
  
  # change par to show phylogenies
  par(mfrow = c(1, 2))

  # phylogeny
  ape::plot.phylo(phy, main = "Phylogenetic tree")
  ape::axisPhylo()
  
  # sampled ancestor tree
  ape::plot.phylo(fbdPhy, main = "Sampled Ancestor tree")
  ape::axisPhylo()
  
  # reset par
  par(oldPar)
}

### 
# we can instead have the sampled ancestors as degree-2 nodes

# set seed
set.seed(1)

# simulate a simple birth-death process
sim <- bd.sim(n0 = 1, lambda = 0.2, mu = 0.05, tMax = 10, 
              nExtant = c(2, Inf))

# make the traditional phylogeny
phy <- make.phylo(sim)

# sample fossils
fossils <- sample.clade(sim, 0.1, 10)

# make the sampled ancestor tree
fbdPhy <- make.phylo(sim, fossils, saFormat = "node")

# plot them
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # visualize longevities and fossil occurrences
  draw.sim(sim, fossils = fossils)
  
  # change par to show phylogenies
  par(mfrow = c(1, 2))

  # phylogeny
  ape::plot.phylo(phy, main = "Phylogenetic tree")
  ape::axisPhylo()
  
  # sampled ancestor tree, need show.node.label parameter to see SAs
  ape::plot.phylo(fbdPhy, main = "Sampled Ancestor tree", 
                  show.node.label = TRUE)
  ape::axisPhylo()
  
  # reset par
  par(oldPar)
}

### 
# we can use the returnTrueExt argument to delete the extinct tips and
# have the last sampled fossil of a species as the fossil tip instead

# set seed
set.seed(5) 

# simulate a simple birth-death process
sim <- bd.sim(n0 = 1, lambda = 0.2, mu = 0.05, tMax = 10, 
              nExtant = c(2, Inf))

# make the traditional phylogeny
phy <- make.phylo(sim)

# sample fossils
fossils <- sample.clade(sim, 0.5, 10)

# make the sampled ancestor tree
fbdPhy <- make.phylo(sim, fossils, saFormat = "node", returnTrueExt = FALSE)
# returnTrueExt = FALSE means the extinct tips will be removed, 
# so we will only see the last sampled fossil (see tree below)

# plot them
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # visualize longevities and fossil occurrences
  draw.sim(sim, fossils = fossils)
  
  # change par to show phylogenies
  par(mfrow = c(1, 2))

  # phylogeny
  ape::plot.phylo(phy, main = "Phylogenetic tree")
  ape::axisPhylo()
  
  # sampled ancestor tree, need show.node.label parameter to see SAs
  ape::plot.phylo(fbdPhy, main = "Sampled Ancestor tree", 
                  show.node.label = TRUE)
  ape::axisPhylo()
  # note how t1.3 is an extinct tip now, as opposed to t1, since 
  # we would not know the exact extinction time for t1,
  # rather just see the last sampled fossil
  
  # reset par
  par(oldPar)
}

### 
# suppose in the last example, t2 went extinct and left no fossils
# this might lead to problems with the root.time object

# set seed
set.seed(5) 

# simulate a simple birth-death process
sim <- bd.sim(n0 = 1, lambda = 0.2, mu = 0.05, tMax = 10, 
              nExtant = c(2, Inf))

# make the traditional phylogeny
phy <- make.phylo(sim)

# sample fossils
fossils <- sample.clade(sim, 0.5, 10)

# make it so t2 is extinct
sim$TE[2] <- 9
sim$EXTANT[2] <- FALSE

# take out fossils of t2
fossils <- fossils[-which(fossils$Species == "t2"), ]

# make the sampled ancestor tree
fbdPhy <- make.phylo(sim, fossils, saFormat = "node", returnTrueExt = FALSE)
# returnTrueExt = FALSE means the extinct tips will be removed, 
# so we will only see the last sampled fossil (see tree below)

# plot them
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # visualize longevities and fossil occurrences
  draw.sim(sim, fossils = fossils)
  
  # change par to show phylogenies
  par(mfrow = c(1, 2))

  # phylogeny
  ape::plot.phylo(phy, main = "Phylogenetic tree")
  ape::axisPhylo()
  
  # sampled ancestor tree, need show.node.label parameter to see SAs
  ape::plot.phylo(fbdPhy, main = "Sampled Ancestor tree", 
                  show.node.label = TRUE)
  ape::axisPhylo()
  # note how t2 is gone, since it went extinct and left no fossils
  
  # this made it so the length of the tree + the root edge 
  # does not equal the origin time of the simulation anymore
  max(ape::node.depth.edgelength(fbdPhy)) + fbdPhy$root.edge
  # it should equal 10
  
  # to correct it, we need to set the root edge again
  fbdPhy$root.edge <- 10 - max(ape::node.depth.edgelength(fbdPhy))
  # this is necessary because ape does not automatically fix the root.edge
  # when species are dropped, and analyes using phylogenies + fossils
  # frequently condition on the origin of the process
  
  # reset par
  par(oldPar)
}

### 
# finally, we can test the usage of returnRootTime

# set seed
set.seed(1)

# simulate a simple birth-death process with more than one
# species and completely extinct:
sim <- bd.sim(n0 = 1, lambda = 0.5, mu = 0.5, tMax = 10, nExtant = c(0, 0))

# make a phylogeny using default values
phy <- make.phylo(sim)

# force phylo to not have root.time info
phy_rootless <- make.phylo(sim, returnRootTime = FALSE)

# plot them
if (requireNamespace("ape", quietly = TRUE)) {
  # store old par settings
  oldPar <- par(no.readonly = TRUE) 
  
  # change par to show phylogenies
  par(mfrow = c(1, 3))
  
  # if we use the default value, axisPhylo works as intended
  ape::plot.phylo(phy, root.edge = TRUE, main = "root.time default value")
  ape::axisPhylo()
  
  # note that without root.edge, we have incorrect times,
  # as APE assumes tMax was the time of first speciation
  ape::plot.phylo(phy, main = "root.edge not passed to plot.phylo")
  ape::axisPhylo()
  
  # if we force root.time to be FALSE, APE assumes the tree is
  # ultrametric, which leads to an incorrect time axis
  ape::plot.phylo(phy_rootless, main = "root.time forced as FALSE")
  ape::axisPhylo()
  # note time scale in axis
  
  # reset par
  par(oldPar)
}


brpetrucci/PaleoBuddy documentation built on Feb. 28, 2025, 3:53 p.m.