bd.sim.traits: MuSSE simulation

View source: R/bd.sim.traits.R

bd.sim.traitsR Documentation

MuSSE simulation

Description

Simulates a species birth-death process following the Multiple State-dependent Speciation and Extinction (MuSSE) or the Hidden State-dependent Speciation and Extinction (HiSSE) model for any number of starting species. Allows for the speciation/extinction rate to be (1) a constant, or (2) a list of values for each trait state. Traits are simulated to evolve under a simple Mk model (see references). Results can be conditioned on either total simulation time, or total number of extant species at the end of the simulation. Also allows for constraining results on a range of number of species at the end of the simulation, either total or extant, using rejection sampling. Returns a sim object (see ?sim), and a list of data frames describing trait values for each interval. It may return true extinction times or simply information on whether species lived after the maximum simulation time, depending on input. Can simulate any number of traits, but rates need to depend on only one (each, so speciation and extinction can depend on different traits).

Usage

bd.sim.traits(
  n0,
  lambda,
  mu,
  tMax = Inf,
  N = Inf,
  nTraits = 1,
  nFocus = 1,
  nStates = 2,
  nHidden = 1,
  X0 = 0,
  Q = list(matrix(c(0, 0.1, 0.1, 0), ncol = 2, nrow = 2)),
  nFinal = c(0, Inf),
  nExtant = c(0, Inf)
)

Arguments

n0

Initial number of species. Usually 1, in which case the simulation will describe the full diversification of a monophyletic lineage. Note that when lambda is less than or equal to mu, many simulations will go extinct before speciating even once. One way of generating large sample sizes in this case is to increase n0, which will simulate the diversification of a paraphyletic group.

lambda

Vector to hold the speciation rate over time. It should either be a constant, or a list of size nStates. For each species a trait evolution simulation will be run, and then used to calculate the final speciation rate. Note that lambda should always be greater than or equal to zero.

mu

Similar to above, but for the extinction rate.

tMax

Ending time of simulation, in million years after the clade origin. Any species still living after tMax is considered extant, and any species that would be generated after tMax is not present in the return.

N

Number of species at the end of the simulation. End of the simulation will be set for one of the times where N species are alive, chosen from all the times there were N species alive weighted by how long the simulation was in that situation. Exactly one of tMax and N must be non-Inf. Note that if N is the chosen condition, mu cannot be 0, since paelobuddy's current algorithm would mean only species 1 would have children. Future features will hopefully remove this limitation.

nTraits

The number of traits to be considered. lambda and mu need not reference every trait simulated.

nFocus

Trait of focus, i.e. the one that rates depend on. If it is one number, that will be the trait of focus for both speciation and extinction rates. If it is of length 2, the first will be the focus for the former, the second for the latter.

nStates

Number of possible states for categorical trait. The range of values will be assumed to be (0, nStates - 1). Can be a constant or a vector of length nTraits, if traits are intended to have different numbers of states.

nHidden

Number of hidden states for categorical trait. Default is 1, in which case there are no added hidden traits. Total number of states is then nStates * nHidden. States will then be set to a value in the range of (0, nStates - 1) to simulate that hidden states are hidden. This is done by setting the value of a state to the remainder of state / nStates. E.g. if nStates = 2 and nHidden = 3, possible states during simulation will be in the range (0, 5), but states (2, 4) (corresponding to (0B, 0C) in the nomenclature of the original HiSSE reference) will be set to 0, and states (3, 5) (corresponding to (1B, 1C)) to 1.

X0

Initial trait value for original species. Must be within (0, nStates - 1). Can be a constant or a vector of length nTraits.

Q

Transition rate matrix for continuous-time trait evolution. For different states i and j, the rate at which a species at i transitions to j is Q[i + 1, j + 1]. Must be within a list, so as to allow for different Q matrices when nTraits > 1.

Note that for all of nStates, nHidden, X0 and Q, if nTraits > 1 and any of those is of length 1, they will be considered to apply to all traits equally. This might lead to problems if, e.g., two traits have different states but the same Q, so double check that you are providing all parameters for the required traits.

nFinal

A vector of length 2, indicating an interval of acceptable number of species at the end of the simulation. Default value is c(0, Inf), so that any number of species (including zero, the extinction of the whole clade) is accepted. If different from default value, simulation will restart until the number of total species at tMax is in the nFinal interval. Note that nFinal must be a sensible vector. The function will error if its maximum is lower than 1, or if its length is not 2.

nExtant

A vector of length 2, indicating an interval of acceptable number of extant species at the end of the simulation. Equal to nFinal in every respect except for that.

Note: The function returns NA if it runs for more than 100000 iterations without fulfilling the requirements of nFinal and nExtant.

Note: Using values other than the default for nFinal and nExtant will condition simulation results.

Details

Please note while time runs from 0 to tMax in the simulation, it returns speciation/extinction times as tMax (origin of the group) to 0 (the "present" and end of simulation), so as to conform to other packages in the literature.

Value

A sim object, containing extinction times, speciation times, parent, and status information for each species in the simulation, and a list object with the trait data frames describing the trait value for each species at each specified interval.

Author(s)

Bruno do Rosario Petrucci.

References

Maddison W.P., Midford P.E., Otto S.P. 2007. Estimating a binary character’s effect on speciation and extinction. Systematic Biology. 56(5):701.

FitzJohn R.G. 2012. Diversitree: Comparative Phylogenetic Analyses of Diversification in R. Methods in Ecology and Evolution. 3:1084–1092.

Beaulieu J.M., O'Meara, B.C. 2016. Detecting Hidden Diversification Shifts in Models of Trait-Dependent Speciation and Extinction. Systematic Biology. 65(4):583-601.

Examples


###
# first, it's good to check that it can work with constant rates

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation
lambda <- 0.1

# extinction
mu <- 0.03

# set seed
set.seed(1)

# run the simulation, making sure we have more than one species in the end
sim <- bd.sim.traits(n0, lambda, mu, tMax, nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  ape::plot.phylo(phy)
}

###
# now let's actually make it trait-dependent, a simple BiSSE model

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation, higher for state 1
lambda <- c(0.1, 0.2)

# extinction, trait-independent
mu <- 0.03

# number of traits and states (1 binary trait)
nTraits <- 1
nStates <- 2

# initial value of the trait
X0 <- 0

# transition matrix, with symmetrical transition rates
Q <- list(matrix(c(0, 0.1,
                   0.1, 0), ncol = 2, nrow = 2))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, tMax, nTraits = nTraits, 
                    nStates = nStates, X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = c("red", "blue")[traits + 1])
}

###
# extinction can be trait-dependent too, of course

# initial number of species
n0 <- 1

# number of species at the end of the simulation
N <- 20

# speciation, higher for state 1
lambda <- c(0.1, 0.2)

# extinction, highe for state 0
mu <- c(0.06, 0.03)

# number of traits and states (1 binary trait)
nTraits <- 1
nStates <- 2

# initial value of the trait
X0 <- 0

# transition matrix, with symmetrical transition rates
Q <- list(matrix(c(0, 0.1,
                   0.1, 0), ncol = 2, nrow = 2))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, N = N, nTraits = nTraits,
                    nStates = nStates, X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = c("red", "blue")[traits + 1])
}

###
# we can complicate the model further by making transition rates asymmetric

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 20

# speciation, higher for state 1
lambda <- c(0.1, 0.2)

# extinction, lower for state 1
mu <- c(0.03, 0.01)

# number of traits and states (1 binary trait)
nTraits <- 1
nStates <- 2

# initial value of the trait
X0 <- 0

# transition matrix, with q01 higher than q10
Q <- list(matrix(c(0, 0.1,
                   0.25, 0), ncol = 2, nrow = 2))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, tMax, nTraits = nTraits, 
                    nStates = nStates, X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = c("red", "blue")[traits + 1])
}

###
# MuSSE is BiSSE but with higher numbers of states

# initial number of species
n0 <- 1

# number of species at the end of the simulation
N <- 20

# speciation, higher for state 1, highest for state 2
lambda <- c(0.1, 0.2, 0.3)

# extinction, higher for state 2
mu <- c(0.03, 0.03, 0.06)

# number of traits and states (1 trinary trait)
nTraits <- 1
nStates <- 3

# initial value of the trait
X0 <- 0

# transition matrix, with symmetrical, fully reversible transition rates
Q <- list(matrix(c(0, 0.1, 0.1,
                   0.1, 0, 0.1,
                   0.1, 0.1, 0), ncol = 3, nrow = 3))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, N = N, nTraits = nTraits, 
                    nStates = nStates, X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # 0 tips = red, 1 tips = blue, 2 tips = green
  ape::plot.phylo(phy, tip.color = c("red", "blue", "green")[traits + 1])
}

###
# HiSSE is like BiSSE, but with the possibility of hidden traits
# here we have 4 states, representing two states for the observed trait
# (0 and 1) and two for the hidden trait (A and B), i.e. 0A, 1A, 0B, 1B

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 20

# speciation, higher for state 1A, highest for 1B
lambda <- c(0.1, 0.2, 0.1, 0.3)

# extinction, lowest for 0B
mu <- c(0.03, 0.03, 0.01, 0.03)

# number of traits and states (1 binary observed trait, 
# 1 binary hidden trait)
nTraits <- 1
nStates <- 2
nHidden <- 2

# initial value of the trait
X0 <- 0

# transition matrix, with symmetrical transition rates. Only one transition
# is allowed at a time, i.e. 0A can go to 0B and 1A,
# but not to 1B, and similarly for others
Q <- list(matrix(c(0, 0.1, 0.1, 0,
                   0.1, 0, 0, 0.1,
                   0.1, 0, 0, 0.1,
                   0, 0.1, 0.1, 0), ncol = 4, nrow = 4))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, tMax, nTraits = nTraits,
                    nStates = nStates, nHidden = nHidden,
                    X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)

  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = c("red", "blue")[traits + 1])
}

###
# we can also increase the number of traits, e.g. to have a neutral trait 
# evolving with the real one to compare the estimates of the model for each

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 20

# speciation, higher for state 1
lambda <- c(0.1, 0.2)

# extinction, lowest for state 0
mu <- c(0.01, 0.03)

# number of traits and states (2 binary traits)
nTraits <- 2
nStates <- 2

# initial value of both traits
X0 <- 0

# transition matrix, with symmetrical transition rates for trait 1,
# and asymmetrical (and higher) for trait 2
Q <- list(matrix(c(0, 0.1,
                   0.1, 0), ncol = 2, nrow = 2),
          matrix(c(0, 1,
                   0.5, 0), ncol = 2, nrow = 2))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, tMax, nTraits = nTraits, 
                    nStates = nStates, X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits1 <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))
traits2 <- unlist(lapply(sim$TRAITS, function(x) tail(x[[2]]$value, 1)))

# make index for coloring tips
index <- ifelse(!(traits1 | traits2), "red", 
                ifelse(traits1 & !traits2, "purple",
                       ifelse(!traits1 & traits2, "magenta", "blue")))
# 00 = red, 10 = purple, 01 = magenta, 11 = blue

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = index)
}

###
# we can then do the same thing, but with the
# second trait controlling extinction

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 20

# speciation, higher for state 10 and 11
lambda <- c(0.1, 0.2)

# extinction, lowest for state 00 and 01
mu <- c(0.01, 0.03)

# number of traits and states (2 binary traits)
nTraits <- 2
nStates <- 2
nFocus <- c(1, 2)

# initial value of both traits
X0 <- 0

# transition matrix, with symmetrical transition rates for trait 1,
# and asymmetrical (and higher) for trait 2
Q <- list(matrix(c(0, 0.1,
                   0.1, 0), ncol = 2, nrow = 2),
          matrix(c(0, 1,
                   0.5, 0), ncol = 2, nrow = 2))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, tMax, nTraits = nTraits, 
                    nStates = nStates, nFocus = nFocus,
                    X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits1 <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))
traits2 <- unlist(lapply(sim$TRAITS, function(x) tail(x[[2]]$value, 1)))

# make index for coloring tips
index <- ifelse(!(traits1 | traits2), "red", 
                ifelse(traits1 & !traits2, "purple",
                       ifelse(!traits1 & traits2, "magenta", "blue")))
# 00 = red, 10 = purple, 01 = magenta, 11 = blue

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = index)
}

###
# as a final level of complexity, let us change the X0
# and number of states of the trait controlling extinction

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 20

# speciation, higher for state 10 and 11
lambda <- c(0.1, 0.2)

# extinction, lowest for state 00, 01, and 02
mu <- c(0.01, 0.03, 0.03)

# number of traits and states (2 binary traits)
nTraits <- 2
nStates <- c(2, 3)
nFocus <- c(1, 2)

# initial value of both traits
X0 <- c(0, 2)

# transition matrix, with symmetrical transition rates for trait 1,
# and asymmetrical, directed, and higher rates for trait 2
Q <- list(matrix(c(0, 0.1,
                   0.1, 0), ncol = 2, nrow = 2),
          matrix(c(0, 1, 0,
                   0.5, 0, 0.75,
                   0, 1, 0), ncol = 3, nrow = 3))

# set seed
set.seed(1)

# run the simulation
sim <- bd.sim.traits(n0, lambda, mu, tMax, nTraits = nTraits, 
                    nStates = nStates, nFocus = nFocus,
                    X0 = X0, Q = Q, nFinal = c(2, Inf))

# get trait values for all tips
traits1 <- unlist(lapply(sim$TRAITS, function(x) tail(x[[1]]$value, 1)))
traits2 <- unlist(lapply(sim$TRAITS, function(x) tail(x[[2]]$value, 1)))

# make index for coloring tips
index <- ifelse(!(traits1 | (traits2 != 0)), "red", 
                ifelse(traits1 & (traits2 == 0), "purple",
                       ifelse(!traits1 & (traits2 == 1), "magenta", 
                              ifelse(traits1 & (traits2 == 1), "blue",
                                     ifelse(!traits1 & (traits2 == 2), 
                                            "orange", "green")))))
                                     
# 00 = red, 10 = purple, 01 = magenta, 11 = blue, 02 = orange, 12 = green

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim$SIM)
  
  # color 0 valued tips red and 1 valued tips blue
  ape::plot.phylo(phy, tip.color = index)
}
# one could further complicate the model by adding hidden states
# to each trait, each with its own number etc, but these examples
# include all the tools necessary to make these or further extensions


brpetrucci/PaleoBuddy documentation built on Feb. 28, 2025, 3:53 p.m.