In ggcostoya/tpcurves2: T-P Curves

knitr::opts_chunk$set(echo = TRUE)
devtools::load_all()

In this page I show the basic logic behind building and running simulations based on TPD. A simulation will mimic the evolution of a natural population through time, accounting for survival of existing individuals and the generation of new individuals through reproduction. A simulation is also a simplification of reality, in this case, I consider evolution to be driven only by a single factor, temperature, determining a single trait performance, which in turn conditions survival and reproduction.

Although the main goal of a simulation is to mimic evolution long-term, its fundamental unit is the change between two moments in time, henceforth called $t$ and $t+1$. Here, I'll first present in detail two functions to simulate survival and reproduction and, later, how they work in conjunction to simulate a full $t$ to $t+1$ interval.

Simulate Survival

The function sim_surv provided in this package, simulates survival in a population from a $t$ to a $t+1$. If a TPD is provided, the function returns a new TPD with surviving individuals. Below, I'll show first how it works, then I'll provide an example and last I'll briefly go through potential improvements.

How it works

The function sim_surv works following these steps:

Select an individual and extract its' TPD
Draw a TPC from the individual's TPD using the function get_tpc.
From the individual's TPC extract it's best performance ($P_{max}$).
From the individual's TPC extract $P_{T_e}$ the performance at the current environmental temperature ( $T_{e}$ ). Being $T_{e}$ one of the parameters that need provided to the sim_surv function.
Define $P_{r}$ the individual's relative performance to the maximum at $T_{e}$ i.e. $P_r = P_{T_{e}} / P_{max}$. In optimal conditions an individual's $P_{r}$ will = 1.
Define the individual's probability of survival, $S$, which is equal to the product of $P_{r}$ and $Sp$. I consider $Sp$ , which also needs to be provided to sim_surv, as the relationship between performance and survival and it needs to be always <1. Let's say that an individual's $P = P_{max}$, therefore, it's $P_{r} = 1$. If we consider an $Sp = 0.9$ then $S = 0.9$ for that individual. I assume that no matter how good is your performance your survival can't be guaranteed.
Using $S$ as the probability, randomly sample from a binomial distribution of size = 1 to determine survival. If 1, the individual survives, if 0 the individual doesn't survive between $t$ and $t+1$ (Note: That does not mean it can't reproduce, more on that later).
If the individual survives, include it's TPD to a new survivor population TPD
Repeat steps 1 - 9 for all individuals of the population at time $t$ to get the TPD of the survivors at $t+1$.

I also include Degree and Pmin among the arguments of the function and like in many other cases in this package, they are used to draw custom curves. The detailed code for sim_surv can be found on the repository portal at R/sim_surv.R.

Example

Below I generate a population's TPD and I simulate the survival of some of its individuals. In this example I am generating an $N = 50$ individual population with a $T_{opt} = 25$, $CT_{max}=30$ and $CT_{min} = 20$. I'm using as $T_{e} = 29$ degrees and I am considering that the relationship between P and Survival is of $Sp = 0.8$ .( Note : I also use $P_{max} = 10$ & $P_{min} = 0$ but these are arbitrary values, the y axis is meant to represent a hypothetical PIT).

# generate original population
pop_t <- gen_pop_tpd( N = 50, Topt = 25, CTmax = 30,CTmin = 20, Pmax = 10, Pmin = 0, 
                      Error = 3, Samples = 10, Degree = 3)

# simulate survival to generate population at t+1
survivors_t1 <- sim_surv(TPD = pop_t, Te = 29, Sp = 0.9, Degree = 3, Pmin = 0)

When plotted, this the TPD and the curves for the original population and survivors would look like:

# preping the data for plotting 

# drop the parentage column in survivors_t1 data
survivors_t1 <- survivors_t1[,-c(4)]

# assign identifiers to both populations 
pop_t <- pop_t %>% mutate(Pop = rep("Original", nrow(pop_t)))
survivors_t1 <- survivors_t1 %>% mutate( Pop = rep("Survivors", nrow(survivors_t1)))

# join TPDs  
ps_tt1 <- rbind(pop_t, survivors_t1)

# extract TPCs
pop_t_tpc <- get_tpc(Tm = pop_t$Tm, P = pop_t$P, Degree = 3, Pmin = 0) %>% mutate(Pop = rep("Original"))
survivors_t1_tpc <- get_tpc( Tm = survivors_t1$Tm, P = survivors_t1$P, Degree = 3, Pmin = 0) %>% mutate(Pop = rep("Survivors"))

# join TPCs
ps_tt1_tpc <- rbind(pop_t_tpc, survivors_t1_tpc)

# plots 

# TPD plot
ggplot(data = ps_tt1_tpc, aes( x = Tm, y = P)) +
  geom_ribbon(aes(ymin = L, ymax = U, fill = factor(Pop)), alpha = 0.25) +
  geom_line(aes(colour = factor(Pop)), lwd = 1.25) +
  geom_point(data = ps_tt1, aes(x = Tm, y = P, colour = factor(Pop))) +
  theme_classic() +
  labs( x = "Temperature", y = "Performance", fill = "Population", colour = "Population") +
  theme(legend.position = "top") +
  geom_hline( aes( yintercept = 0), lty = 2) +
  scale_fill_manual(values=c("grey", "royalblue")) +
  scale_color_manual(values=c("grey", "royalblue"))

# change in population size 
n_t <- pop_t %>% nest(tpd = c(Tm,P)) %>% nrow() %>% as.numeric()
n_t1 <- survivors_t1 %>% nest(tpd = c(Tm, P)) %>% nrow() %>% as.numeric()
Population_sizes <- data.frame(Time = c("t", "t + 1"), N = c(n_t, n_t1))
Survival_rate <- Population_sizes$N[2]/Population_sizes$N[1]

# comparison in TPTs 
nt_tpts <- get_tpts(Tm = pop_t$Tm, P = pop_t$P, Degree = 3, Pmin = 0) %>% mutate(Population = rep("Original")) %>%
  spread(trait,value = value)
nt1_tpts <- get_tpts(Tm = survivors_t1$Tm, P = survivors_t1$P, Degree = 3, Pmin = 0) %>% mutate(Population = rep("Survivors")) %>% spread(trait, value = value)
TPTs <- rbind(nt_tpts, nt1_tpts)

After just one $t$ to $t+1$ period, the shape of the population's TPC has changed noticeably. As shown below, all those TPT related to temperature tolerance have shifted, most notably $T_{opt}$ has shifted upwards due to the selection imposed by a higher $T_{e}$. Notice that the other temperature TPTs have also shifted; Most likely $CT_{max}$ and $CT_{min}$ have most likely moved further away from the optimal making the $T_{br}$ wider.

Furthermore, $P_{max}$ has also changed. There are two possibilities to explain it: First, if it has increased, that would mean that individual having an already higher $P_{max}$ were been more likely to survive. Second, if it decreased, that would mean that individual's with wider curves in which $P$ descends more slowly as it approaches the $CTs$ have survived.

TPTs

However, a larger population $P_{max}$ does not reflect a better population state. Considering the initial $T_{opt} = 25$ and $CT_{max} = 30$, a $T_{e} = 29$ is a quite significant selective pressure very close to the upper temperature tolerance limit. That is why, after one period the population size has significantly diminished and the survival rate has been quite low.

Population_sizes

Survival_rate

(* I emphasize on "most likely" wording since the simulation outcome is different every time and subject to stochastic variation)

Potential Improvements

Things that could improve the sim_surv function:

Including age as a new variable and make the relationship between performance and survival ($Sp$) also dependent on it.
Including other covariates influencing $Sp$ in order to make the simulation more realistic.
What are realistic $Sp$ values?
Perhaps the amount of change due to mutation should not be a percentage but a fixed amount.

Simulate Reproduction

The function sim_repr also provided here simulates reproduction in a population from $t$ to $t+1$. If a population's TPD is provided, the function returns a new TPD is returned with new individuals. It is important to mention that for now I am following an important assumption: All individuals in the population are females. This means that individuals do not have to mate and that reproduction is exclusively. More on this on the potential improvements section.

Below, I'll follow the same principle as in the Simulate Survival section; how it works, an example and potential improvements

How it works

The function sim_repr works following these steps, many of which are shared with sim_surv:

Select an individual and extract its' TPD
Get the individual's TPTs from its' TPD using the function get_tpts.
Draw a TPC from the individual's TPD using the function get_tpc.
From the individual's TPC extract $P_{T_e}$ the performance at the current environmental temperature ( $T_{e}$ ). Being $T_{e}$ also, one of the parameters that need provided to the sim_repr.
Define $P_{r}$ the individual's relative performance to the maximum at $T_{e}$ i.e. $P_r = P_{T_{e}} / P_{max}$.
Define the individual's probability of reproduction, $R$, which is equal to the product of $P_{r}$ and $Rp$. I consider $Rp$ , which also needs to be provided to sim_repr, as the relationship between performance and reproduction. Differently to $Sp$ in sim_surv, in can be > 1. For example, $Rp = 1.5$ would mean that an individual with $P_{r} = P{max}$ would have a 150% probability of having offspring, i.e. it is guaranteed it will have 1 offspring and it has a 50% chance of having a second one.
Determine the individual's offspring output based on $R$. The output is the sum of guaranteed and non-guaranteed offspring. To determine guaranteed offspring $R$ is rounded down into smallest whole number. Non-guaranteed offspring is calculated by using the difference between $R$ and rounded $R$ as the probability value into a sample and size = 1 binomial distribution from which either a 0 or a 1 is obtained.
For each offspring determine its base TPTs by copying those of the parent saved in step 2.
Determine in which TPTs are mutations occurring. This is done by taking as many samples as TPTs are (usually 5) from a binomial distribution of size 1 and with probability equal to $\mu$, the mutation rate, another argument that needs to be provided to sim_repr. To keep things realistic, $\mu$ should not be > 0.01
For those TPTs that mutate, determine the new TPT value. This is done by adding or subtracting the current TPT value and a given amount. This amount is determined by sampling a value from a normal distribution where the mean is the individual's TPT times $\mu m$ and the SD is the TPT times $\mu sd$. Both $\mu m$ and $\mu sd$ are arguments of sim_repr and both are in units of %. These two arguments determine by how much can a trait mutate. As common practice I've been using 5% for $\mu_m$ and around 1% for $\mu_sd$. I am not sure by how much can a trait change over generations by I don't think it is wise to let a trait change to any other arbitrary value since it is not biologically realistic.
Using the offspring's TPTs (whether they have mutated or not) use the gen_tpd function to generate some random TPD for it. As part of this process, Degree, Pmin, Samples and Error need to be specified so they are also included among the arguments of sim_repr.
Provide a new identifier for the offspring using gen_id.
Include the offspring's data into the offspring population at $t+1$ TPD.
Repeat steps 8 - 13 for all offspring an individual has.
Repeat steps 1 - 13 for all individuals alive in the population at $t$.

The detailed code for sim_repr can be found on the repository portal at R/sim_repr.R.

Example

Below I use the population I generated earlier to simulate the reproduction of some of its individuals. To simulate new offspring I use the sample $T_{e} = 29$ as before and an $Rp = 0.5$.

# simulate reproduction to generate offspring at t+1
offspring_t1 <- sim_repr(TPD = pop_t, Te = 29, Rp = 0.5, Mu = 0.01, Mum = 0.05, Musd = 0.01, 
                         Degree = 3, Pmin = 0, Samples = 10, Error = 0.1 )

When plotted, this the TPD and the curves for the original population and offspring would look like:

# prerping data for plotting

#drop parentage column in offspring 
offspring_t1 <- offspring_t1[,-c(4)]

# assign identifiers to offspring population 
offspring_t1 <- offspring_t1 %>% mutate( Pop = rep("Offspring", nrow(offspring_t1)))

# join TPDs
pr_tt1 <- rbind(pop_t, offspring_t1)

# extract TPC from offspring population
offspring_t1_tpc <- get_tpc(Tm = offspring_t1$Tm, P = offspring_t1$P, Degree = 3, Pmin = 0) %>% mutate(Pop = rep("Offspring"))

# join TPCs
pr_tt1_tpc <- rbind(pop_t_tpc, offspring_t1_tpc)

# plots
ggplot(data = pr_tt1_tpc, aes( x = Tm, y = P)) +
  geom_ribbon(aes(ymin = L, ymax = U, fill = factor(Pop)), alpha = 0.25) +
  geom_line(aes(colour = factor(Pop)), lwd = 1.25) +
  geom_point(data = pr_tt1, aes(x = Tm, y = P, colour = factor(Pop))) +
  theme_classic() +
  labs( x = "Temperature", y = "Performance", fill = "Population", colour = "Population") +
  theme(legend.position = "top") +
  geom_hline( aes( yintercept = 0), lty = 2) +
  scale_fill_manual(values=c("red", "grey")) +
  scale_color_manual(values=c("red", "grey"))

# population sizes
n_t1 <- offspring_t1 %>% nest(tpd = c(Tm, P)) %>% nrow() %>% as.numeric()
Population_sizes <- data.frame(Time = c("t", "t + 1"), N = c(n_t, n_t1))
Reproductive_rate <- Population_sizes$N[2]/Population_sizes$N[1]

# TPTs comparison
nt1_tpts <- get_tpts(Tm = offspring_t1$Tm, P = offspring_t1$P, Degree = 3, Pmin = 0) %>% mutate(Population = rep("Offspring")) %>% spread(trait, value = value)
TPTs <- rbind(nt_tpts, nt1_tpts)

Comparing the TPTs of the original and the offspring population we can see that there has also been change. Following the same logic as with survival, we would expect the offspring's $T_{opt}$ to shift upwards with respect to parents. Individual's with already higher $T_{opt}$ would have better performance at $T_e$ and thus more likely to reproduce. Again, this is just speculation that should be followed by a multiple generation simulation.

TPTs

Also, as $Rp$ has been set really low at 0.5 which translates into a very small offspring output and only a small % of the population actually reproducing.

Population_sizes

Reproductive_rate

Potential Improvements

Include different sexes.
Add mating chance as another covariate determining reproduction. Maybe mating change can also be linked to performance.
Do some research on by how much can different TPTs can change if they mutate. Perhaps an average of 5% with a variation of 1% is not that realistic.
TPTs being defined by a function conditioned by age?
What are realistic $Rp$ values?

Putting it together; Simulate t to t + 1

Compiling together sim_surv and sim_repr, this package includes the function sim_time, able to simulate survival and reproduction between two moments in time. sim_time takes all arguments from the functions that make it and returns a TPD with survivors and offspring combined. Below I show a quick example using it:

# clean the "Pop" column on pop_t 
pop_t0 <- pop_t[,-c(4)]

pop_t1 <- sim_time(TPD = pop_t0, Te = 29, Sp = 0.9, Rp = 0.5, Mu = 0.01, Mum = 0.05, Musd = 0.01, 
                   Degree = 3, Pmin = 0, Samples = 10, Error = 0.1)

# plotting 

#drop parentage column  
pop_t1 <- pop_t1[,-c(4)]

# assign identifiers to offspring population 
pop_t1 <- pop_t1 %>% mutate( Pop = rep("Next", nrow(pop_t1)))

# join TPDs
pp_tt1 <- rbind(pop_t, pop_t1)

# extract TPC from offspring population
pop_t1_tpc <- get_tpc(Tm = pop_t1$Tm, P = pop_t1$P, Degree = 3, Pmin = 0) %>% mutate(Pop = rep("Next"))

# join TPCs
pp_tt1_tpc <- rbind(pop_t_tpc, pop_t1_tpc)

# plots
ggplot(data = pp_tt1_tpc, aes( x = Tm, y = P)) +
  geom_ribbon(aes(ymin = L, ymax = U, fill = factor(Pop)), alpha = 0.1) +
  geom_line(aes(colour = factor(Pop)), lwd = 1.25) +
  geom_point(data = pp_tt1, aes(x = Tm, y = P, colour = factor(Pop))) +
  theme_classic() +
  labs( x = "Temperature", y = "Performance", fill = "Population", colour = "Population") +
  theme(legend.position = "top") +
  geom_hline( aes( yintercept = 0), lty = 2) +
  scale_fill_manual(values=c("purple", "grey")) +
  scale_color_manual(values=c("purple", "grey"))

# data comparison
 n_t1 <- pop_t1 %>% nest(tpd = c(Tm, P)) %>% nrow() %>% as.numeric()
Population_sizes <- data.frame(Time = c("t", "t + 1"), N = c(n_t, n_t1))
Population_growth <- Population_sizes$N[2]/Population_sizes$N[1]

# TPTs comparison
nt1_tpts <- get_tpts(Tm = pop_t1$Tm, P = pop_t1$P, Degree = 3, Pmin = 0) %>% mutate(Population = rep("Next")) %>% spread(trait, value = value)
TPTs <- rbind(nt_tpts, nt1_tpts)