title: "morse
: an R-package to analyse toxicity test data"
tags:
- R
- ecotoxicology
- dataviz
- dose-response
- GUTS
- TKTD
- Bayesian inference
- Survival
- Reproduction
date: "2021-04-07"
output:
pdf_document:
extra_dependencies: ["float"]
authors:
- name: Virgile Baudrot
affiliation: 1
- name: Sandrine Charles
orcid: 0000-0003-4604-0166
affiliation: 1, 2
affiliations:
- name: University of Lyon, University Lyon 1, CNRS UMR5558, Laboratory de Biometry and Evolutionary Biology, 69100 Villeurbanne, France.
index: 1
- name: Corresponding author (email)
index: 2
bibliography: paper.bib
The morse
package is used for the analysis of experimental data collected from standard toxicity tests. It provides ready-to-use functions to visualize a dataset and to estimate several toxicity indices to be further used in support of environmental risk assessment in full compliance with regulatory requirements [@OECD2006]. Such toxicity indices are indeed typically requested by standardized regulatory guidelines on which national agencies base their evaluation of applications for regulatory approval of chemical substances; see for example OECD toxicity testing guideline [@OECD2016; @OECD2012]. Tools gathered together within the morse
package involve the most advanced and innovative methods developed for ecotoxicology, such as appropriate stochastic parts in dose-response modelling [@MLDM2014], and the use of Bayesian statistics to get parameter estimates as posterior probability distribution [@Billoir2008]. Hence, these tools are easily accessible for ecotoxicologists and regulators who do not need to deeply invest in the underlying technicalities in order to perform a valuable quantitative environmental risk assessment.
This paper provides an overview of a typical use of the morse
package with survival data collected over time and at different increasing exposure concentrations. These data are analysed with the reduced version of GUTS models based on the stochastic death hypothesis (namely, the GUTS-RED-SD
model). This example can be followed, step-by-step, to analyse any new dataset, as long as the dataset format is respected.
The morse
package can be used to get estimates of $LC_x$ ($x$\% Lethal Concentration) or $EC_x$ ($x$\% Effective Concentration) by fitting standard exposure-response or effect models on toxicity test data. Toxicity indicator estimates as well as model parameters are provided along with the quantification of their uncertainty. The morse
package can also be used to get estimates of the $NEC$ (No Effect Concentration) by fitting a toxicokinetic-toxicodynamic (TKTD) model (namely GUTS
models, that is General Unified Threshold models of Survival). Using GUTS
models also allows a user to get estimates of $LC_{(x,t)}$ (whatever $x$ and $t$) and $LP_{(x,t)}$, the latter being defined by EFSA as the $x$\% multiplication factor leading to an additional reduction of $x$\% in survival at the end of the exposure profile. Above all, GUTS
models can be used on data collected under time-variable exposure profiles.
The morse
package [@morse2021] has been tested using R
(version 3.5 and later) on macOS, Linux and Windows machines. Regarding the particular case of TKTD models for survival, namely GUTS models, the morse
package was ring-tested together with nine other GUTS
implementations under different software platforms; see Appendix A in @Jager2018 for details. Note that we were the only team to perform the ring-test under a Bayesian framework. Other implementations were \texttt{BYOM}, \texttt{DEBTOX}, \texttt{DELPHI}, \texttt{EPYTOX}, \texttt{GUTS-3S}, \texttt{MATHEMATICA}, \texttt{MODELMAKER}, \texttt{OPENMODEL}, \texttt{EASYGUTS} and \texttt{GATEAUX}, all under a frequentist statistical framework. All participants to the ring-test received the same datasets and tasks, carried out their simulations independently from each other and sent the results back to the coordinator for analysis. Output estimation was similar to other implementations, therefore, package morse
was confirmed as fit-for-purpose in fitting GUTS models on survival toxicity test data.
All functions in the morse
package can be used without a deep knowledge of their underlying probabilistic model or inference methods. Rather, they are designed to behave as well as possible, without requiring the user to provide values for some parameters. Nevertheless, models implemented in morse
can also be used as a first step to tailor new models for more specific situations.
Note that morse
benefits from a web interface, MOSAIC, from which the same analyses can be reproduced directly on-line without having to implement them directly in R
programming. MOSAIC is freely available at https://mosaic.univ-lyon1.fr/ [@Charles2017] (Figure \ref{fig:mosaic}).
The morse
package is available as an R
package; it can be directly downloaded from CRAN https://CRAN.R-project.org/package=morse, where package dependencies and system requirements are also documented. The development version can be found on GitHub https://github.com/pveber/morse, where code contributions, bug reports, fixes and feature requests are more than welcome by opening issues and pull requests.
The main functions in the morse
package are survData()
, reproData()
, and plotDoseResponse()
to visualize raw data. The survFitTT()
, reproFitTT()
, and survFit()
functions allow a user to fit a model on data in order to estimate toxicity indicators, the choice of which depends on the type of data. Fitting outputs can either be displayed with plot()
or synthesized with summary()
. Functions are available to check the goodness-of-fit, namely ppc()
and plot_prior_post()
. Predictions can be performed with predict()
, predict_ode()
, predict_Nsurv()
, and predict_Nsurv_ode()
. Finally, function LCx()
and MFx()
allow a user to get $x$\% lethal concentrations or profiles, respectively.
The morse
package currently handles binary and count data, for example survival and reproduction data. Functions dedicated to binary (respectively, count) data analysis start with a surv
(respectively, repro
) prefix. morse
provides a similar workflow in both cases:
In addition, for binary data handled with GUTS
models, the morse
package also allows a user to:
See @EFSA2018 for details.
Those steps are presented in depth in the Tutorial
available at https://cran.r-project.org/web/packages/morse/vignettes/tutorial.html, with all necessary details to implement all morse
features. A more formal description of the models and the estimation procedures are provided in a document called "Models in morse package" available at https://cran.r-project.org/web/packages/morse/vignettes/modelling.pdf. Please refer to this documentation for further introduction to the use of the morse
package.
The morse
package is linked to JAGS http://mcmc-jags.sourceforge.net/, a Bayesian sampler used to perform inference with all implemented models. So, you need to download and install JAGS at https://sourceforge.net/projects/mcmc-jags/. Then you must test that your R
graphical user interface has access to JAGS, and, if not, to specify where JAGS can be found on your computer. Indeed, once installed, JAGS can be lost in the PATH. To help solve this issue, you can use package runjags
which is installed and loaded as follows.
### install the `runjags` package, if needed
if(is.element('runjags', installed.packages()[,1]) == FALSE){
install.packages('runjags')
}
### load the `runjags` package
library("runjags")
### run test
testjags()
The output should look like this:
You are using R version 4.0.2 (2020-06-22) on a windows machine, with the RStudio GUI
JAGS version 4.3.0 found successfully using the command
'C:/Program Files/JAGS/JAGS-4.3.0/x64/bin/jags-terminal.exe'
The rjags package is installed
Otherwise, you can specify to your R
graphical user interface where the JAGS executable is located in your computer (somewhere in 'C:/Program Files/JAGS/JAGS-4.3.0/x64/bin/jags-terminal.exe'
on Windows machines):
testjags(jags=runjags.getOption('jagspath'))
### replace `jagspath` by your own PATH to JAGS
### For instance
### 'C:/Program Files/JAGS/JAGS-4.3.0/x64/bin/jags-terminal.exe'
morse
and its dependenciesIn order to use the morse
package, you need to install it with all its dependencies on other R-packages: mandatory ones (coda
, deSolve
, dplyr
, epitools
, graphics
, grDevices
, ggplot2
($\geqslant$ 2.1.0), grid
, gridExtra
, magrittr
, methods
, reshape2
, rjags
($\geqslant$ 4.0), stats
, tibble
, tidyr
, zoo
) and suggested ones (knitr
, rmarkdown
, testthat
). For this purpose, you can use the two classical R
commands:
### install the `morse` package, if needed
if(is.element('morse', installed.packages()[,1]) == FALSE){
install.packages('morse')
}
### load the `morse` package
library(morse)
Note that the morse
package is also linked to C++. C++ is used for running simulations leading to predictions. In R
, you should not have issues with C++ requirements since it is very well integrated (many R
functions are simple interfaces to C++ functions). Feel free to report any trouble at https://github.com/pveber/morse/issues by opening a new issue for the morse
package.
We assume hereafter that morse
and all the above dependencies have been corrected installed. To illustrate the use of morse
, we will use a standard survival dataset coming from a chronic laboratory toxicity test with Gammarus pulex, a freshwater invertebrate, exposed to increasing concentrations of propiconazole (a fungicide) during four days. Eight concentrations were tested with two replicates of 10 organisms per concentration. Survival was monitored at five time points (at day 0, 1, 2, 3 and 4) [@Nyman2012].
We will use the reduced version of the GUTS model based on the stochastic death hypothesis (namely, the GUTS-RED-SD
model), as recommended by the European Food Safety Authority (EFSA) for the environmental risk assessment (ERA) of plant protection products potentially toxic for aquatic living organisms [@EFSA2018]. This model can also be fitted on-line with the MOSAIC web platform [@Baudrot2018b]. Below is the modus operandi with the morse
package to be followed step-by-step in order to be in full compliance with the EFSA workflow for ERA [@EFSA2018].
### load package `morse`
library(morse)
### load a dataset
data("propiconazole")
### create a morse object for binary data analysis
survData_PRZ <- survData(propiconazole)
### fit a reduced GUTS model (GUTS-RED) with option "SD" (Stochastic Death)
fit_cstSD <- survFit(survData_PRZ, model_type = "SD")
### plot the fitting result
plot(fit_cstSD)
Using a GUTS model with morse
allows the user to get a probability distribution on the $x$\% lethal concentration whatever the exposure duration $t$, namely the $LC_{(x,t)}$. By default, $t$ corresponds to the last time point in the dataset and $x = 50$\%.
### run function LCx()
LCX_cstSD <- LCx(fit_cstSD, X = 50)
### plot the output as a concentration-response curve
plot(LCX_cstSD)
Validation consists in predicting the number of survivors over time under pulsed-exposure profiles for which observations have also been collected. Predictions are then compared to observations and their adequacy is checked according to several validation criteria defined by EFSA [@EFSA2018]. The aim of this step is to choose an appropriate model for the following step.
### load data collected under pulsed exposure profiles
data("propiconazole_pulse_exposure")
### predict the number of survivors for all profiles
predict_Nsurv <- predict_Nsurv_ode(
object = fit_cstSD,
data_predict = propiconazole_pulse_exposure
)
### plot results
plot(predict_Nsurv)
Once the predictions are visually checked (Figure \ref{fig:nsurv}), quantitative validation criteria need to be calculated.
### check for adequacy between predictions and observations
predict_Nsurv_check(predict_Nsurv)
check <- predict_Nsurv_check(predict_Nsurv)
check$Percent_PPC_global
check$Percent_NRMSE_global
This reveals that, in total, 84\% of the observations lie within the uncertainty band of the predictions, while the global variability of data around the predictions is 16.2\%. For both criteria, a maximum value of 50\% is expected, what means here that we do not expect specific risk for the species and the chemical compound under consideration.
Risk assessors are interested in testing various exposure scenarios, having a certain environmental realism that is varying over time. Risk assessors expect to evaluate the potential impact of these profiles on survival of target species to protect. Typically, they want to compute the multiplication factor $MF_{(x,t)}$ that could be applied to the exposure profile without reducing more than by $x$\% the survival probability at a specified test duration $t$ (default being the last time point of the exposure profile). This is the so-called $x$\% lethal profile, denoted $LP_x$, and newly proposed by @EFSA2018. This calculation is provided by function MFx()
in morse
.
The mathematical definition of the $x$\% Multiplication Factor at time $t$ (at the end of a time series $T = {0, …, t}$) is given by:
$$S\left(MF_{(x,t)} \times C_w(\tau \in T), t\right) = S( C_w(\tau \in T), t) \times \left(1-\frac{x}{100}\right)$$
where $C_w(\tau \in T)$ is the original exposure profile, and expression $S\left(MF_{(x,t)} \times C_w(\tau \in T), t\right)$ the survival probability after the exposure profile has been translated upward by a multiplication $MF_{(x,t)}$; the new exposure profile thus becomes equal to $MF_{(x,t)} \times C_w(\tau \in T)$.
### define an exposure profile (here a theoretical one)
data_4MFx <- data.frame(time = 1:10,
conc = c(0,0.5,8,3,0,0,0.5,8,3.5,0))
### run function MFx()
MFx_PRZ_cstSD <- MFx(object = fit_cstSD, data_predict = data_4MFx, ode = TRUE)
### plot the survival probability at the end of the exposure profile
### according to a range of multiplication factors (log-scale)
plot(MFx_PRZ_cstSD, log_scale = TRUE)
Finally, it may be useful to predict the survival probability under any exposure profile (time-variable or not), for example when designing new experiments or to better understand what happens in field. Below are some examples that you can use to build your own simulations.
### define an exposure profile (here a theoretical one)
### note that here you need to specify a third column `replicate`
data_example <- data.frame(
time = c(1,1.9,2,15,15.1,20),
conc = c(0,0,20,20,0,0),
replicate = rep("Basic example", 6)
)
### perform basic prediction
predict_example_NULL <- predict_ode(
object = fit_cstSD,
data_predict = data_example,
mcmc_size = 10,
interpolate_length = NULL)
### plot the result for only few exposure time points
plot(predict_example_NULL)
### define the same basic exposure profile
### but by changing the `replicate` value
data_example <- data.frame(
time = c(1,1.9,2,15,15.1,20),
conc = c(0,0,20,20,0,0),
replicate = rep("Basic example (interpolation)", 6)
)
##### perform prediction with interpolation of the exposure profile
predict_example_100 <- predict_ode(
object = fit_cstSD,
data_predict = data_example,
mcmc_size = 10,
interpolate_length = 100)
# plot the result
plot(predict_example_100)
### load an environmentally realistic profile
data("FOCUSprofile")
FOCUSprofile[,"replicate"] <- "FOCUS example"
### perform prediction
predict_FOCUS <- predict_ode(
object = fit_cstSD,
data_predict = FOCUSprofile,
mcmc_size = 10,
interpolate_length = NULL)
### plot the result
plot(predict_FOCUS)
morse
The morse
package was recently used to evaluate the added value of using TKTD models in comparison with classical dose-response models, based on a case study with the snail Limnaea stagnalis when exposed to increasing concentrations of cadmium [@baudrot2018a]. Also based on the morse
package, we proposed some recommendations to address TKTD assessment using uncertainties in environmental risk models [@Baudrot2019].
More recently, benefiting from our experience in developing the morse
package for ecotoxicology, we strongly contributed to the new rbioacc
package https://CRAN.R-project.org/package=rbioacc, a turn-key package providing bioaccumulation metrics (BCF/BMF/BSAF) from a toxicokinetic (TK) model fitted to accumulation-depuration data. The rbioacc
package also supports the MOSAIC$_{bioacc}$ web platform https://mosaic.univ-lyon1.fr/bioacc. Last but not least, the 'rDEBtktd' package is currently under development https://gitlab.in2p3.fr/sandrine.charles/rDEBtktd to complement the morse
package in order to fit, validate, and predict with TKTD models simultaneously describing growth and reproduction dynamics of living organisms under chemical pressure.
A collection of eight datasets is made available directly in the morse
package (use function data()
). These datasets can also be downloaded online from the MOSAIC web platform by visiting the different modules: https://mosaic.univ-lyon1.fr.
V.B. (main developer of morse
): conceptualization, methodology, formal analysis, data curation, visualization, writing manuscript. S.C.: supervision, funding acquisition, project administration, formal analysis, data curation, writing manuscript.
We fully thank all contributors to the morse
package: Marie Laure Delignette-Muller, Wandrille Duchemin, Benoît Goussen, Nils Kehrein, Guillaume Kon-Kam-King, Christelle Lopes, Philippe Ruiz, Alexander Singer and Philippe Veber. We also thank the French National Agency for Water and Aquatic Environments (ONEMA, now the French Agency for Biodiversity) for its financial support to the development of the morse
package. Finally, we thank a lot our reviewers, Tom Purucker and Peter Vermeiren, for their helpful comments leading to improvements in the manuscript and directly in the morse
package, as well as Kristina Riemer for the management of the review process and for being so patient in finalizing this manuscript.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.