knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 5, fig.height = 5 )
library(FossilSim)
This vignette provides information about how the package stores and outputs information about species taxonomy via the taxonomy
object and available options for simulating species taxonomy.
A phylogenetic tree object, in the format used by most software packages, does not typically contain information about species over time.
A timescaled phylogeny is informative about the minimum number of co-existing lineages at a given moment but not about the start and end point of species or the mode of speciation.
Three modes of speciation that have been discussed widely in palaeobiological literature are shown in Fig. 1. and can be modelled using FossilSim
:
knitr::include_graphics("speciation_types.jpg") cap = "**Fig. 1**. Three different modes of speciation that can be simulated using FossilSim. Budding and bifurcation are also referred to as asymmetric and symmetric speciation."
The distinction between these processes is important because they have consequences on parameter estimates using different methods.
In molecular phylogenetics, we typically draw trees as fully bifurcating structures (i.e. each node gives rise to two descendant branches).
This is also how trees are stored in computer memory by most phylogenetic software packages, including ape
.
However, it may not be the case that each internal edge represents a unique species.
Instead, some speciation events may be budding rather than bifurcating or anagenesis may have occurred at some point along internal edges.
To explore the impact of different speciation modes it is valuable to know the relationship between species through time (the taxonomy) and the corresponding tree object.
FossilSim
contains options for simulating species evolution under different speciation modes,
described in the section [Simulating taxonomy].
Information about the true species taxonomy is stored in the taxonomy
object, which will be associated with the corresponding phylo
object used to simulate species evolution.
The object contains a dataframe detailing species taxonomy, which contains the following 7 fields for each edge and species in the corresponding phylo
object:
sp
the true species identity label. If all species originated via budding or bifurcation this will always correspond to the terminal-most edge label (i.e. the youngest node) associated with each species. This is not the case if the data set also contains anagenetic species, when multiple species may be associated with a single edge
edge
edge label of the branch in the corresponding phylo object. Note that some species may be associated with multiple edges
parent
ancestor of species sp
. Parent labels follow the same convention as species. The label assigned to the parent of the origin or root will be zero
start
start time of the corresponding edge and/or origin time of the species. If the corresponding edge is also the oldest edge associated with the species this value will equal the species origination time. If speciation mode is asymmetric or symmetric the speciation time will match the start time of the corresponding edge. If speciation mode is anagenetic the speciation time will be younger than the start time of the corresponding edge
end
end time of the corresponding edge and/or end time of the species. If the corresponding edge is also the youngest edge associated with the species this value will equal the species end time. Unless the species end time coincides with an anagenetic speciation event, the species end time will match the end time of the corresponding edge. If the species end time coincides with an anagenetic speciation event, the end time will be older than the end time of the corresponding edge
mode
mode = speciation mode. "o" = origin or "r" = root (the edge/species that began the process). "b" = asymmetric or budding speciation. "s" = symmetric or bifurcating speciation. "a" = anagenetic speciation
A new taxonomy
object can be created by passing the above information to the taxonomy()
function.
Cryptic species can also be simulated. Under cryptic speciation the descendant species will be indistinguishable from the ancestor and the taxonomy
object can incorporate this information using the additional optional fields:
cryptic
TRUE if the speciation event was cryptic. If this information is not passed to taxonomy()
, the function assumes cryptic = FALSE
for all species
cryptic.id
cryptic species identity, i.e. ancestral species that the current species will be identified as. If cryptic = TRUE
, cryptic.id
will differ from the true species identity sp
The relationship between information provided by the taxonomy
object and the corresponding phylo
object will become clear when you start simulating species and exploring the output.
The taxonomy
object can be used as the starting point for simulating fossils or other downstream analyses that require information about discrete species units.
Special Note The birth and death rates used as parameters in the birth-death process typically do not correspond to the rates of appearance and extinction of species under the above processes, unless speciation occurs entirely via budding.
Species evolution or taxonomy can be simulated in FossilSim
under a mixed model of speciation that can incorporate three modes of speciation -- budding (or asymmetric), bifurcating (or symmetric) and anagenetic -- in addition to cryptic speciation [@Stadler2018].
This model of mixed speciation was described previously in [@Bapst2013b] and is also available as part of the paleotree
package [@Bapst2012].
In a birth-death process $\lambda$ is the rate at which branching events occur, while $\mu$ is the rate at which lineage termination occurs. These are the parameters typically used to simulate birth-death trees (see the code snippet below for an example). The mixed model of speciation includes three additional parameters, $\beta, \lambda_a$ and $\kappa$, which are defined as follows:
The relationship between these processes and the underlying tree structure in the corresponding phylo
object is described in section [The taxonomy object].
Simulating species evolution or taxonomy for any user specified tree in FossilSim
is straightforward.
# set the random number generator seed to generate the same results using the same code set.seed(123) # simulate a tree using TreeSim conditioned on tip number lambda = 1 mu = 0.2 tips = 8 t = TreeSim::sim.bd.taxa(n = tips, numbsim = 1, lambda = lambda, mu = mu)[[1]] # t is an object of class phylo t # use t$edge, t$edge.length, t$root.edge to see the tree attributes # simulate under complete budding speciation s = sim.taxonomy(tree = t) # this is equivalent to using the default parameters beta = 0, lambda_a = 0, kappa = 0 # s is an object of class taxonomy s
Note that the way sim.taxonomy
assigns budding species is deterministic -- at each branching event, the "left" descendant edge will always be assigned the ancestral species label and the "right" descendant edge will always be the new species.
This means even without setting the random number seed sim.taxonomy
will always produce the same output when $\beta = 0$.
FossilSim
contains various options for plotting the package output.
The plot.taxonomy
function will produce a plot highlighting species taxonomy along with the corresponding bifurcating tree object.
Note that R detects what type of object you are trying to plot, in this case taxonomy, and will automatically call plot.taxonomy
when you apply the function plot
to an object of class taxonomy
.
Use ?plot.taxonomy
to view further options that can be used to change the appearance of the figure.
plot(s, tree = t, legend.position = "topleft")
# simulate under complete bifurcating speciation s = sim.taxonomy(tree = t, beta = 1) plot(s, tree = t, legend.position = "topleft")
# simulate under mixed speciation s = sim.taxonomy(tree = t, beta = 0.5, lambda.a = 1, kappa = 0.1) plot(s, tree = t, legend.position = "topleft")
Note that plot.taxonomy
only plots the true species, and ignores any information about cryptic speciation (i.e. the function uses the sp
labels and ignores cryptic.id
).
Given an existing taxonomy
object you can also add anagenetic or cryptic species downstream using the functions sim.anagenetic.species
and sim.cryptic.species
.
These functions also return a taxonomy
object.
# simulate taxonomy without anagenetic or cryptic species s1 = sim.taxonomy(tree = t, beta = 0.5) # simulate anagenetic species # note this function also requires the corresponding tree object s2 = sim.anagenetic.species(tree = t, species = s1, lambda.a = 1) # simulate cryptic species s3 = sim.cryptic.species(species = s2, kappa = 0.1)
These functions can not be used with taxonomy
objects that already contain anagenetic or cryptic species.
See the paleotree
vignette to see how FossilSim
objects can be converted into paleotree
objects.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.