Sometimes, we may want to modify the environmental sequence of a stochastic IPM stored in PADRINO. As of now, there is no single interace for doing this, and so you may need to write some code on your own. This vignette is meant to provide a bit of help with that. It will make use of Rpadrino
and ipmr
to modify proto_ipm
s, and doesn't assume any prior knowledge of the data structure or of PADRINO itself (I don't think, anyway. Let me know if I'm incorrect via a Github Issue).
These aren't explicitly marked in the Metadata table. However, we can find them by searching the EnvironmentalVariables
table, specifically the ipm_id
column.
library(Rpadrino) pdb <- pdb_download(FALSE) unique(pdb$EnvironmentalVariables$ipm_id)
For now, we'll just work with the two IPMs from Westerband et al. (2016):
stoch_ids <- c("aaaa15", "aaaa16")
Ok, now we have the data we want to work with, we can check out various ways of modifying them to explore the questions we want to answer!
First, we'll examine how to re-set IPMs to sample from a pre-determined sequence of environmental values. This is useful for exploring things like, for example, environmental autocorrelation and/or enhancing reproducibility of our subsequent analyses.
This will require a few steps:
Create proto_ipm
objects from the PADRINO data.
Remove the current stochastic environmental state expressions
Replace them with something that will generate the values we want.
First, we should investigate what's in each model by creating and printing the proto_ipm
s.
# Step 1 - create proto_ipms, and then figure out which parameters in the # vital rate expressions we need to work with. protos <- pdb_make_proto_ipm(pdb, stoch_ids) protos[[1]]
It seems that j
and A_max
are the two that we need to create sequences for. Let's double check the EnvironmentalVariables
table to be sure:
pdb$EnvironmentalVariables[pdb$EnvironmentalVariables$ipm_id %in% stoch_ids, ]
library(knitr) kable(pdb$EnvironmentalVariables[pdb$EnvironmentalVariables$ipm_id %in% stoch_ids, ], caption = "EnvironmentalVariables Table, subsetted using 'stoch_ids'")
Next, we need to simulate values for j
and A_max
. This next chunk replicates what PADRINO is doing in a simulation, it just does the sampling ahead of time. In your own code, you'll want to replace it with, say, creating autocorrelated sequences of vectors, or a specific climate change scenario.
use_j <- sample(1:5, size = 50, replace = TRUE) use_a <- runif(50, 5, 8) # The names of this data.frame need to be the same as the names of the variables # we're substituting in the IPM. Otherwise, the function we write in the next # block won't work! use_env_data <- data.frame(j = use_j, A_max = use_a)
We now have our values! But how to insert them and then run the model? We need to write a function that samples these values in the order we want, and then sets their names correctly. It should return a named list. For this example, we'll assume that above, we created them in the correct order, and that each row of the resulting data.frame
corresponds to a time step. For now, don't worry about where index
will come from - that will become clear shortly.
sample_env <- function(env_data, index) { as.list(env_data[index, ]) %>% setNames(names(env_data)) }
We're ready to begin working with the actual proto_ipm
objects! For now, we'll just modify the first one. The first step is to eliminate the current environmental information from the proto_ipm
. That information is stored in the env_state
column. By default, it is NA
unless there is a continuous environmental variation, so we want to re-set it to that state before we insert our own information.
# For this demonstration, we'll create copy in the 'protos' object and overwrite # that. protos$aaaa15_2 <- protos$aaaa15 protos$aaaa15_2$env_state <- lapply( protos$aaaa15_2$env_state, function(x) return(NA) ) # In practice, you may just want to to overwrite the proto_ipm object in place. # We can still recover the object by re-building the proto_ipm list from the # intact version of the 'pdb' object. # protos$aaaa15$env_state <- lapply( # protos$aaaa15$env_state, # function(x) return(NA) # ) # If you want to do get rid of both at the same time, use a double-lapply: # protos <- lapply(protos, # function(x) { # x$env_state <- lapply(x$env_state, # function(y) return(NA)) # # })
Next, we're going to insert our environmantal information using ipmr
's define_env_state()
. In the call to sample_env()
, we set index = t
. t
is an internal variable that ipmr
defines to track what timestep of the simulation it is currently on. This will sample from the correct row of use_env_data
. We also supply the use_env_data
object and sample_env()
function in the data_list
slot so that they can be found when the simulation runs.
# Replace the environemntal set up with the one we just created using ipmrs' # define_env_state. protos$aaaa15_2 <- define_env_state( protos$aaaa15_2, env_vars = sample_env(use_env_data, index = t), data_list = list(use_env_data = use_env_data, sample_env = sample_env) )
We can rebuild the models now using pdb_make_ipm()
:
ipms <- pdb_make_ipm(protos) lambda(ipms)
cap_1 <- "Environmental covariate sequence from unaltered PADRINO data\n(first 10 iterations)" cap_2 <- "Environmental covariate sequence from custom function\n(first 10 iterations)" kable(head(ipms$aaaa15$env_seq, n = 10), caption = cap_1, format = "html", table.attr = "style='width:70%;'")
kable(head(ipms$aaaa15_2$env_seq, n = 10), caption = cap_2, format = "html", table.attr = "style='width:70%;'")
And finally, a quick check to verify that our pre-defined sequence is indeed the one that got used for aaaa15_2
:
all.equal(ipms$aaaa15_2$env_seq$j, use_env_data$j) all.equal(ipms$aaaa15_2$env_seq$A_max, use_env_data$A_max)
This is a bit simpler, because we just need to make some alterations to a table, rather than creating new expressions. Unfortunately, we can't use the parameters()<-
interface for environmental variation because that setter function does not modify the right place in the proto_ipm
object.
The env_variable
column of the EnvironmentalVariables
usually provides a very brief summary of what these parameters are, but not always. In general, PADRINO notation in the vr_expr_name
column mirrors the publication notation. So to understand the meaning of A_max
, we need to consult the original publication (Westerband, A. C., Horvitz, C. C. (2017). Photosynthetic rates influence the population dynamics of understory herbs in stochastic light environments. Ecology, 98 (2): 370-381). Upon doing this, we see that A_max
corresponds to photosynthetic capacity. For this instance, we'll pretend that we want to set the lower limit of A_max
to a smaller value, and the upper limit to a higher value.
knitr::kable(pdb$EnvironmentalVariables[pdb$EnvironmentalVariables$ipm_id %in% stoch_ids, ], caption = "EnvironmentalVariables table, subsetted using 'stoch_ids'")
We can see that A_max
is given by a Uniform distribution with parameters A_max_low
and A_max_high
. These two, as their names imply, are the lower and upper limits for A_max
, respectively. We can modify these just as we do any other data.frame
. We'll create a copy of the pdb
object to modify, but you can modify it in place and re-download an unaltered copy, or save the unaltered copy before altering it to preserve the original version. Again, we'll only modify aaaa15
.
pdb_2 <- pdb inds <- which(pdb_2$EnvironmentalVariables$vr_expr_name %in% paste0("a_max_", c("low", "high")) & pdb_2$EnvironmentalVariables$ipm_id == "aaaa15") pdb_2$EnvironmentalVariables$env_range[inds[1]] <- 3 # Bump min down two steps pdb_2$EnvironmentalVariables$env_range[inds[2]] <- 9 # Bump max up two steps # The rest is the usual building workflow: make a proto_ipm, then convert to # ipm. new_protos <- pdb_make_proto_ipm(pdb_2, "aaaa15") new_ipms <- pdb_make_ipm(new_protos) lambda(new_ipms)
What if we want to change the distribution that a certain environmental variable is sampled from? For example, rather than use a uniform distribution, we decide we want to sample it from a Gamma distribution. There are a couple steps we need to go through. They are:
Write a function that substitutes distribution names. We can find PADRINO's distribution naming convention in the "Abbreviation" column here.
Write a function to insert the new parameter values and names into the EnvironmentalVariables
Table.
Wrap 1-2 in a top-level function that we'll actually use.
Use it to modify EnvironmentalVariables
, then rebuild the IPMs using pdb_make_proto_ipm()
and pdb_make_ipm()
.
NB: The functions we define in 1-3 will probably be incorporated into
Rpadrino
at some point in the near future, but further testing for stability is needed before they become part of the package. As such, use them at your own risk!
The function in 1 will make use of rlang
to safely modify function names and arguments. It is possible to do this with gsub("Unif", "Gamma", pdb$EnvironmentalVariables$env_function[index])
(where index is the row number of the function you want to update), but could lead to unexpected results sometimes.
This is probably the most straightforward of the functions to write. We know that there two steps we have to implement that this wraps: substitution the functioal form of the distribution, and substituting the parameter values. Additionally, we're going to insert a subsetting capacity so that we can pass a larger database loop over a set of ipm_id
s to rapidly update many models at once.
library(rlang) sub_env_fun <- function(db, # pdb object parameter, # parameter created by the distribution new_fun, # The new distribution new_pars, # The parameters for the new distribution id = NULL) { # ipm_id to operate on if(!is.null(id)) { if(length(id) > 1L) stop("'id' should only have length 1!") db <- pdb_subset(db, id) } ev_tab <- db$EnvironmentalVariables %>% sub_rng_fun(parameter, new_fun, new_pars) %>% sub_rng_pars(parameter, new_pars) db$EnvironmentalVariables <- ev_tab return(db) }
The first substituting function we'll write is the one that substitutes the distribution we wish to use into the env_function
column. This will use the names of the new_pars
object as arguments in a call to the new_fun
function. Don't worry about finding the correct R
r<distribution>
function here - we use PADRINO's syntax to modify these.
sub_rng_fun <- function(ev_tab, parameter, new_fun, new_pars) { # Get the old function form old_fun_ind <- which(ev_tab$vr_expr_name == parameter) # Create a new function call, and insert new arguments arg_nms <- syms(names(new_pars)) new_fun <- call2(new_fun, !!! syms(names(new_pars))) # Convert back to a string and insert into the table ev_tab$env_function[old_fun_ind] <- expr_text(new_fun) return(ev_tab) }
There is a lot going on in this function, and most of it is beyond the scope of this vignette (if you're curious about the details, see the Metaprogramming
topics here). The key takeaway is that we take our new function, "Gamma"
and convert it into a function call with arguments that are named after new_pars
. Then we stick it back into the table, overwriting the old "Unif"
function.
Adding a parameter value to the table is far less complicated. First, we remove the parameters associated with the old distribution. We have to do this because of the way that Rpadrino
finds the parameters to functions in the EnvironmentalVariables
table. The first lines of the function accomplish this. After that, we create a version of the EnvironmentalVariables
table that contains our new parameters and add those back in.
sub_rng_pars <- function(ev_tab, parameter, new_pars) { # Remove the existing parameters associated with the distribution # that we've replaced. if(sum(ev_tab$env_variable == parameter) > 1) { rm_ind <- ev_tab$env_variable == parameter & ev_tab$model_type == "Parameter" ev_tab <- ev_tab[!rm_ind, ] } new_rows <- data.frame( ipm_id = rep(unique(ev_tab$ipm_id), length(new_pars)), env_variable = rep(parameter, length(new_pars)), vr_expr_name = names(new_pars), env_range = unlist(new_pars), env_function = rep("NULL", length(new_pars)), model_type = "Parameter" ) rownames(new_rows) <- NULL out <- rbind(ev_tab, new_rows) return(out) } small_pdb <- pdb %>% pdb_subset("aaaa15") new_pdb <- sub_env_fun(db = small_pdb, id = NULL, "A_max", "Gamma", list(a_max_shape = 14, a_max_rate = 3.5))
kable(new_pdb$EnvironmentalVariables, caption = "New 'EnvironmentalVariables' table")
Notice that our new Gamma distribution parameters have found their way into the parameter table. We can now rebuild the model as using the same pipeline as above.
new_ipm <- new_pdb %>% pdb_make_proto_ipm() %>% pdb_make_ipm() lambda(new_ipm)
It is probably ok to copy/paste the functions we defined above and apply to in your own analyses. However, as they are not yet incorporated into Rpadrino
, I cannot promise that the results will be error free.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.