bd.sim: General rate Birth-Death simulation

Description Usage Arguments Value Author(s) Examples

View source: R/bd.sim.R

Description

Simulates a species birth-death process with general rates for any number of starting species. Allows for the speciation/extinction rate to be (1) a constant, (2) a function of time, (3) a function of time and an environmental variable, or (4) a vector of numbers. Allows for constraining results on the number of species at the end of the simulation, either total or extant. The function can also take an optional shape argument to generate age-dependence on speciation and/or extinction, assuming a Weibull distribution as a model of age-dependence. Returns an object containing vectors of speciation times, extinction times, parents (= species' mother species) and status at the end of the simulation (extant or not) for each species in the simulation. It may return true extinction times or simply information on whether species lived after the maximum simulation time. Please note while time runs from 0 to tMax in the simulation, it returns speciation/extinction times as tMax (origin of the group) to 0 (the "present" and end of simulation), so as to conform to other packages in the literature.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
bd.sim(
  n0,
  pp,
  qq,
  tMax,
  pShape = NULL,
  qShape = NULL,
  envPP = NULL,
  envQQ = NULL,
  pShifts = NULL,
  qShifts = NULL,
  nFinal = c(0, Inf),
  extOnly = FALSE,
  trueExt = FALSE
)

Arguments

n0

Initial number of species. Usually 1, in which case the simulation will describe the full diversification of a monophyletic lineage. Notice that when pp is less than or equal to qq, many simulations will go extinct before speciating even once. One way of generating simulations in this case is to increase n0. The user should notice this will simulate the diversification of a paraphyletic group.

pp

Speciation rate (per species per million years) over time. It can be a numeric describing a constant rate, a function(t) describing the variation in speciation over time t, a function(t, env) describing the variation in speciation over time following both time AND an environmental variable (please see envPP for details) or a vector containing rates that correspond to each rate between speciation rate shift times times (please see pShifts). Note that pp should should always be greater than or equal to zero.

qq

Similar to pp, but for the extinction rate.

Note: rates should be considered as running from 0 to tMax, as the simulation runs in that direction even though the function inverts times before returning in the end.

tMax

Ending time of simulation, in million years after the clade origin. Any species still living after tMax is considered extant, and any species that would be generated after tMax is not born.

pShape

Shape of the age-dependency in speciation rate. This will be equal to the shape parameter in a Weibull distribution: when smaller than one, speciation rate will decrease along each species' age (negative age-dependency). When larger than one, speciation rate will increase along each species's age (positive age-dependency). Default is NULL, equivalent to an age-independent process. For pShape != NULL (including when equal to one), pp will be considered a scale (= 1/rate), and rexp.var will draw a Weibull distribution instead of an exponential. This means Weibull(rate, 1) = Exponential(1/rate). Note that even when pShape != NULL, pp may still be time-dependent.

qShape

Similar to pShape, but for the extinction rate.

Note: Time-varying shape is implemented, so one could have pShape or qShape be a function of time. It is not thoroughly tested, however, so it may be prudent to wait for a future release where this feature is well established.

envPP

A data.frame representing the variation of an environmental variable (e.g. CO2, temperature, available niches, etc) with time. The first column of this data.frame must be time, and the second column must be the values of the variable. This will be internally passed to the make.rate function, to create a speciation rate variation in time following the interaction between the environmental variable and the function. Note paleobuddy has two environmental data frames, temp and co2. One can check RPANDA for more examples.

envQQ

Similar to envPP, but for the extinction rate.

pShifts

Vector of rate shifts. First element must be the starting time for the simulation (0 or tMax). It must have the same length as pp. c(0, x, tMax) is equivalent to c(tMax, tMax - x, 0) for the purposes of make.rate.

qShifts

Similar to qShifts, but for the extinction rate.

Note that using this method for step-function rates is currently slower than using ifelse.

nFinal

A vector of length 2, indicating an interval of acceptable number of species at the end of the simulation. Default value is c(0, Inf), so that any number of species (including zero, the extinction of the whole clade) is accepted. If different from default value, the process will run until the number of total species reaches a number in the interval nFinal. Please note that extOnly can modify the meaning of this parameter. If extOnly = TRUE, the function will run repeatedly until the number of extant species at the end of the simulation lies within the nFinal interval. Note that using values other than the default for nFinal might condition simulation results.

extOnly

A logical indicating whether nFinal should be taken as an interval of the number of total or extant species during the simulation. If TRUE, the function will run repeatedly until the number of extant species at the end of the simulation lies within the nFinal interval. If FALSE (as default), it will run until the total number of species generated lies within that interval.

Note: The function returns NA if it runs for more than 100000 iterations without fulfilling the requirements of nFinal and extOnly.

trueExt

A logical used for bd.sim.general, indicating whether it should return true or truncated extinction times. When TRUE, time of extinction of extant species will be the true time, otherwise it will be NA if a species is alive at the end of the simulation.

Value

The return list of either bd.sim.constant or bd.sim.general, which have the same elements, as follows

TE

List of extinction times, with 0 as the time of extinction for extant species.

TS

List of speciation times, with NA as the time of speciation for species that started the simulation.

PAR

List of parents. Species that started the simulation have NA, while species that were generated during the simulation have their parent's number. Species are numbered as they are born.

EXTANT

List of booleans representing whether each species is extant.

Author(s)

Bruno do Rosario Petrucci.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
# we will showcase here some of the possible scenarios for diversification,
# touching on all the kinds of rates

###
# consider first the simplest regimen, constant speciation and extinction

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation
p <- 0.11

# extinction
q <- 0.08

# run the simulation, making sure we have more than 1 species in the end
sim <- bd.sim(n0, p, q, tMax, nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

###
# now let us complicate speciation more, maybe a linear function

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation rate
p <- function(t) {
  return(0.03 + 0.005*t)
}

# extinction rate
q <- 0.08

# run the simulation, making sure we have more than 1 species in the end
sim <- bd.sim(n0, p, q, tMax, nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  # full phylogeny
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

# what if we want q to be a step function?

# vector of extinction rates
qList <- c(0.09, 0.08, 0.1)

# vector of shift times. Note qShifts could be c(40, 20, 5) for identical results
qShifts <- c(0, 20, 35)

# let us take a look at how make.rate will make it a step function
q <- make.rate(qList, tMax = tMax, fShifts = qShifts)

# and plot it
plot(seq(0, tMax, 0.1), q(seq(0, tMax, 0.1)), type = 'l',
     main = "Extintion rate as a step function", xlab = "Time (My)",
     ylab = "Rate (species/My)")

# looking good, we will keep everything else the same

# maximum simulation time
tMax <- 40

# initial number of species
n0 <- 1

# speciation
p <- function(t) {
  return(0.02 + 0.005*t)
}

# a different way to define the same extinction function
q <- function(t) {
  ifelse(t < 20, 0.09, 
         ifelse(t < 35, 0.08, 0.1))
}

# run the simulation
sim <- bd.sim(n0, p, q, tMax, nFinal = c(2, Inf))
# we could instead have used qList and qShifts
# that is, however, much slower

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

###
# we can also supply a shape parameter to try age-dependent rates

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation - here note it is a Weibull scale
p <- 10

# speciation shape
pShape <- 2

# extinction
q <- 0.08

# run the simulation
sim <- bd.sim(n0, p, q, tMax, pShape = pShape, nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

### 
# scale can be a time-varying function as well

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation - here note it is a Weibull scale
p <- function(t) {
  return(2 + 0.25*t)
}

# speciation shape
pShape <- 2

# extinction
q <- 0.2

# run the simulation
sim <- bd.sim(n0, p, q, tMax, pShape = pShape, nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

###
# finally, we can also have a rate dependent on an environmental variable,
# like temperature data

# get temperature data 
data(temp)

# initial number of species
n0 <- 1

# maximum simulation time
tMax <- 40

# speciation - a scale
p <- 10

# note the scale for the age-dependency can be a time-varying function

# speciation shape
pShape <- 2

# extinction, dependent on temperature exponentially
q <- function(t, env) {
  return(0.1*exp(0.01*env))
}

# need a variable to tell bd.sim the extinction is environmentally dependent
envQQ <- temp

# by passing q and envQQ to bd.sim, internally bd.sim will make q into a
# function dependent only on time, using make.rate
qq <- make.rate(q, envF = envQQ)

# take a look at how the rate itself will be
plot(seq(0, tMax, 0.1), qq(seq(0, tMax, 0.1)),
     main = "Extinction rate varying with temperature", xlab = "Time (My)",
     ylab = "Rate", type = 'l')

# run the simulation
sim <- bd.sim(n0, p, q, tMax, pShape = pShape, envQQ = envQQ,
              nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

###
# one can mix and match all of these scenarios as they wish - age-dependency
# and constant rates, age-dependent and temperature-dependent rates, etc. The
# only combination that is not allowed is a vector rate and environmental
# data, but one can get around that as follows

# initial number of species
n0 <- 1

# speciation - a step function of temperature built using ifelse()
p <- function(t, env) {
  ifelse(t < 20, env,
         ifelse(t < 30, env / 4, env / 3))
}

# speciation shape
pShape <- 2

# environment variable to use - temperature
envPP <- temp

# this is kind of a complicated scale, let us take a look

# make it a function of time
pp <- make.rate(p, envF = envPP)

# plot it
plot(seq(0, tMax, 0.1), pp(seq(0, tMax, 0.1)),
     main = "Speciation scale varying with temperature", xlab = "Time (My)",
     ylab = "Scale", type = 'l')

# extinction - high so this does not take too long to run
q <- 0.2

# maximum simulation time
tMax <- 40

# run the simulation
sim <- bd.sim(n0, p, q, tMax, pShape = pShape, envPP = envPP,
              nFinal = c(2, Inf))

# we can plot the phylogeny to take a look
if (requireNamespace("ape", quietly = TRUE)) {
  phy <- make.phylo(sim)
  ape::plot.phylo(phy)
}

# note nFinal has to be sensible
## Not run: 
# this would return a warning, since it is virtually impossible to get 100
# species at a process with diversification rate -0.09 starting at n0 = 1
sim <- bd.sim(1, pp = 0.01, qq = 1, tMax = 100, nFinal = c(100, Inf))

## End(Not run)

brpetrucci/paleobuddy documentation built on Aug. 8, 2020, 2:03 a.m.