edmaps is an R package that greatly simplifies the computational steps required to create maps of establishment likelihood as a function of pathway arrival rates, climate suitability and host availability.
It uses a framework developed in a recent CEBRA report, Camac et al. (2020), Developing pragmatic maps of establishment likelihood for plant pests, which has since been expanded and updated in the latest CEBRA report, Camac et al. (2021), Using edmaps & Zonation to inform multi-pest early-detection surveillance designs.
It works by using a user-defined Microsoft Excel spreadsheet (see below) to create a make-like workflow that automates the entire process from raw data processing through to the production of GeoTIFF rasters as well as presentation-quality static and interactive maps.
Users benefit from edmaps workflows because they define the dependency structure among the objects in the analysis, and ensure that after modifications (e.g., changes to files, R objects, function arguments), any affected dependants are updated. To create such workflows, edmaps uses the drake package for R.
Specifically, edmaps automates the many steps needed to create pest-specific maps of establishment likelihood.
Broadly, these steps are grouped as follows:
edmaps
edmaps estimates establishment likelihoods by harnessing data from a wide range of publicly available spatial datasets created by Australian governments and scientists.
A collated repository of the spatial layers required by edmaps can be downloaded from the edmaps_data_Australia repository on GitHub (https://github.com/jscamac/edmaps_data_Australia).
Broadly, this repository contains the following data types:
edmaps in identifying coastal zones and area of extent)
citrus_native_hosts_example.tif (An example native host layer created using edmaps::rasterize_range())
containers_bypostcode.xls
parameters.xlsx (An example file for specifying global and species parameters)
renv.lock (A renv lock file that contains the R package dependencies required to run edmaps outside of Docker)
To use edmaps, first download this data repository. This can be done either by visiting the website, clicking Code and then Download ZIP, or by using git to clone the repository from the shell, terminal or RStudio terminal:
git clone https://github.com/jscamac/edmaps_data_Australia.git
Once the data repository is downloaded, you can access the user_input/parameters.xlsx workbook.
This Excel workbook is the main user-defined file that defines the workflow that edmaps will implement.
The workbook contains two tabs: one for specifying global parameters and another for specifying pest-specific parameters.
The first tab (Global variables) defines the set of parameters that are relevant to all pests considered. These include various input data paths, GBIF account details, distance decay parameters and whether interactive html maps should be produced.
The second tab (Species-specific parameters) contains pest-specific parameters, where each pest is a separate row in the spreadsheet. These parameters encompass:
The Excel workbook provides tooltips with additional information to help users parametrise the file correctly.
edmaps
edmaps depends on multiple spatial libraries (e.g. gdal, proj) and R packages.
While these can be installed manually, we advise doing so only as a last resort.
This is because each computer is different, and is likely to use different versions of spatial libraries or R packages. Recently there have been considerable changes to many spatial R packages and libraries that have broken backwards compatibility. As such, it is extremely difficult to ensure that edmaps will run in all settings.
For this reason, and to ensure edmaps functionality in perpetuity, we have designed edmaps to be implemented via a Docker container.
Docker is the world’s leading software container platform that is used to create lightweight, self-contained virtual Linux systems that contain all relevant open source software required to run developed software.
Unlike a traditional virtual machine, a Docker container does not bundle a full operating system.
Rather, it includes only the libraries and settings required to make the developed software work.
This means that Docker can be used to eliminate "it works on my machine" problems when running software.
The other major advantage is that it removes the need for users to install software dependencies themselves, as that work has already been done in the image.
In the following sections we outline two methods for running edmaps, given that parameters.xlsx has been filled out.
While we recommend the Docker approach, we acknowledge that not all users will have the appropriate permissions to use it.
As such, we also outline one additional approach that uses the R package renv. This approach will ensure the correct versions of R package dependencies are installed; however, it cannot ensure that the correct versions of system requirements (e.g. R, Java, pandoc, proj and gdal libraries) are installed.
edmaps with Docker (recommended)
We have created a Docker image that contains all the system libraries, software (e.g. R, Java, pandoc) and R packages required to install and run workflows produced by edmaps.
To use the image, first install Docker onto your machine.
When installing Docker on Windows, you will be prompted to select whether to use Linux or Windows containers.
Leave this at its default (i.e. to use Linux containers). Once Docker is installed, we recommend adjusting its settings so that containers have access to at least 16 GB of RAM and a specified number of CPUs. We recommend allowing access to all but one core for processing workflows derived from edmaps, especially when they cover many pests.
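As a rough aid for choosing the CPU setting, the "all but one core" recommendation can be computed from within R. This is a minimal sketch using only the parallel package that ships with R; the helper name suggest_cores is hypothetical and not part of edmaps, and the resulting number still has to be entered in Docker's own settings by hand.

```r
# Suggest a CPU allocation that leaves one core free for the host system.
# suggest_cores() is a hypothetical helper, not an edmaps function.
# parallel::detectCores() can return NA on some platforms, so fall back to 1.
suggest_cores <- function(total = parallel::detectCores(logical = TRUE)) {
  if (is.na(total)) return(1L)
  max(1L, as.integer(total) - 1L)
}

cat("Cores to allocate to Docker:", suggest_cores(), "\n")
```

The count includes logical (hyper-threaded) cores; pass logical = FALSE to detectCores() if you would rather base the allocation on physical cores.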
Next, download the edmaps Docker image by entering the following at the command line, using Command Prompt or the terminal (requires an internet connection):
docker pull jscamac/edmaps
Once the Docker image has been successfully downloaded, use the Command Prompt or terminal to navigate to the local copy of the data directory outlined above. Next run the system-specific command line:
macOS & Linux
docker run -d -v $(pwd):/home/rstudio/ -p 127.0.0.1:8787:8787 \
-e DISABLE_AUTH=true jscamac/edmaps
NOTE: Windows users may need to replace $(pwd) with the path to the downloaded repository or possibly %cd%.
The above command will launch a container running a local RStudio server that has access to your data directory. You can open this RStudio session by opening your web browser and navigating to the following address: localhost:8787/
Once you can see the RStudio server, and you have specified your parameters in the user_input/parameters.xlsx workbook, you can build your R workflow by copying the following code into the R console:
## Prevent rJava issue with OpenStreetMaps
Sys.setenv(NOAWT=1)
## Load edmaps
library(edmaps)
# Set drake options.
# - Suppress drake's interactive confirmation menu when calling make()
options(drake_make_menu = FALSE)
# - Set up parallel processing
future::plan(future.callr::callr)
# Build edmaps plan based on input workbook
edmaps_plan <- edmaps::excel_to_plan(file = 'user_input/parameters.xlsx')
Next we can run the workflow by doing the following:
drake::make(edmaps_plan, retries=3)
This will process the raw spatial files, estimate establishment likelihoods, and produce static and interactive maps as well as GIS-compatible rasters (GeoTIFF), all of which can be found within the outputs/ directory. Depending on how much computing resource you have given Docker access to, and on the number of species for which you are estimating establishment likelihoods, this may take anywhere from 30 minutes to a couple of hours to run.
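To confirm that the run produced rasters, the outputs/ directory can be scanned for GeoTIFF files. The snippet below is a sketch in base R only; list_geotiffs is a hypothetical helper, and it assumes the rasters carry a .tif or .tiff extension, since the internal layout of outputs/ is not described here.

```r
# List GeoTIFF files below a directory, searching sub-directories as well.
# list_geotiffs() is a hypothetical helper, not an edmaps function.
# Assumption: rasters end in .tif or .tiff (matched case-insensitively).
list_geotiffs <- function(dir = "outputs") {
  if (!dir.exists(dir)) return(character(0))
  list.files(dir, pattern = "\\.tiff?$", recursive = TRUE,
             ignore.case = TRUE, full.names = TRUE)
}

print(list_geotiffs())
```

An empty result means either that the workflow has not finished or that the working directory is not the data repository.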
If you would like to generate a log of the entire workflow used by edmaps based on your parameter inputs, you can obtain that by running:
drake::plan_to_code(edmaps_plan, 'plan.log')
edmaps on local computer with renv
If insufficient privileges exist to use Docker, an alternative is to use a renv (reproducible environments) workflow.
The renv R package reproduces a pre-defined package environment, ensuring that specified package versions are used. To do this, it creates a private, isolated package library for an R project, and obtains defined versions of packages from defined repositories.
Versions and repositories are specified in a "lock file" (renv.lock), and the lock file describing the versions used for the case studies is provided in the data repository. Since the renv package library is project-specific, it will not overwrite or interfere with package versions used in other projects.
The most important difference between the Docker approach described above and renv is that the latter manages R packages only; required system libraries and tools must be installed manually.
Assuming software dependencies (e.g. Java, R) are available, recreating the package environment with renv is straightforward.
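Because renv does not manage system software, it can help to check which external tools are visible before restoring the package environment. The following is a sketch in base R; check_tools is a hypothetical helper, and the executable names (java, pandoc, gdalinfo, proj) are illustrative assumptions rather than a list defined by edmaps. Finding an executable on the PATH does not guarantee that the matching library version is the one the R packages need.

```r
# Report which external command-line tools can be found on the PATH.
# check_tools() is a hypothetical helper, not an edmaps function.
# Assumption: the tool names below stand in for the system requirements.
check_tools <- function(tools = c("java", "pandoc", "gdalinfo", "proj")) {
  paths <- Sys.which(tools)  # returns "" for tools that are not found
  data.frame(tool = tools, found = nzchar(paths), path = unname(paths),
             stringsAsFactors = FALSE)
}

print(check_tools())
```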
First, navigate to the data directory (see above) using R/RStudio. Then in the console run:
# Install renv if necessary
if (!requireNamespace('renv', quietly = TRUE) || packageVersion('renv') != '0.15.2') {
  if (!dir.exists('lib')) dir.create('lib')
  install.packages(
    'renv', repos = 'https://cran.microsoft.com/snapshot/2022-02-10', lib = 'lib')
  library(renv, lib.loc = 'lib')
}
# Set up package environment
renv::consent(provided=TRUE)
renv::init(bare=TRUE, settings=list(use.cache=FALSE))
renv::restore(clean=TRUE, prompt=FALSE)
This will create the local R package environment within the data repository. Once all R dependencies have been installed, you should be able to build the workflow and run edmaps in your own R session using the same code as above.
If you are having issues implementing edmaps, feel free to create an issue here outlining your problem. I'll endeavour to resolve it ASAP.