knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
library(magrittr)
When using the code included in this research compendium, please cite all of the following:
d'Alpoim Guedes, Jade and R. Kyle Bocinsky. Climate change stimulated agricultural innovation and exchange across Asia. In review.
d'Alpoim Guedes, Jade and R. Kyle Bocinsky. Research compendium for: Climate change stimulated agricultural innovation and exchange across Asia, 2018. Version 1.0.0. Zenodo. https://doi.org/10.5281/zenodo.1239106
d'Alpoim Guedes, Jade and R. Kyle Bocinsky. Data output for: Climate change stimulated agricultural innovation and exchange across Asia, 2018. Version 1.0.0. Zenodo. http://doi.org/10.5281/zenodo.788601
The files at the URL above will generate the results as found in the publication. The files hosted at https://github.com/bocinsky/gutaker2020 are the development versions and may have changed since this compendium was released.
This repository is a research compendium package for the thermal niche model presented in Gutaker et al. (2020), and initially developed in d'Alpoim Guedes and Bocinsky (2018). The compendium contains all code associated with the analyses described and presented in the publication, as well as a Docker environment (described in the Dockerfile) for running the code.
This compendium is an R package, meaning that by installing it you are also installing most required dependencies. See below for hints on installing some of the command-line tools necessary in this analysis on macOS and Linux. This compendium takes a lot of its cues from Ben Marwick's rrtools package for performing reproducible research.
The analyses presented in Gutaker et al. (2020) are performed in an RMarkdown vignette (GutakerEtAl2020.Rmd) located in the analysis directory.
git
To download this research compendium as you see it on GitHub, for offline browsing, install git on your computer and use this line at a Bash prompt ("Terminal" on macOS and Unix-alikes, "Command Prompt" on Windows):
# Clone the repository
git clone https://github.com/bocinsky/gutaker2020.git
# Change directories into the local repository
cd gutaker2020
# Check out the publication tag
git checkout tags/1.0.0
Among the system dependencies for this package are GDAL, FFmpeg, Ghostscript, V8 v3.15, and protobuf. These packages (and their respective dependencies) must be installed in order to run the analyses. Additionally, Cairo must be among the capabilities of your particular R installation (as it probably is if you installed from a pre-compiled binary download available on CRAN), and a recent version of Pandoc is required for building the README.md file.
We strongly suggest using Homebrew to install the system dependencies. Homebrew commands might be:
brew install gdal --with-complete --with-unsupported
brew install ffmpeg
brew install ghostscript
brew install protobuf
brew install v8@3.15
brew install pandoc
brew install pandoc-citeproc
Please refer to the dockerfiles for rocker/geospatial and bocinsky/bocin_base.
This software has not been tested on Windows, but should install and work fine if all system requirements are installed.
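After installing the system dependencies, a quick check that the command-line tools are visible on your PATH can save a failed run later. The sketch below is only illustrative: the function name missing_tools is hypothetical, and the executable names passed to it (gdalinfo, ffmpeg, gs, protoc, pandoc) are assumptions about what each dependency installs on a typical system. V8 is a library rather than a command, so it is not covered here.

```shell
# Sketch: print the name of every command that cannot be found on PATH.
# The command names used in the example call are assumptions; adjust as needed.
missing_tools() {
  for tool in "$@"; do
    # `command -v` fails (and prints nothing) when the command is not found
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Example: list any missing tools (prints nothing when all are found)
missing_tools gdalinfo ffmpeg gs protoc pandoc
```

Any name printed by the call is a tool that still needs to be installed or added to your PATH.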
Some installations of R---particularly R >= 3.5.0 running on macOS---will throw a "vector memory exhausted" error when running the analysis. This occurs when R allocates larger vectors than allowed by default; see the R NEWS file for 3.5.0 for details. If you get this error, increasing the R_MAX_VSIZE environment variable might solve the issue. Run these lines in the terminal:
cd ~
touch .Renviron
open .Renviron
Then, add this to the first line of .Renviron:
R_MAX_VSIZE=100Gb
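If you would rather not edit the file by hand, the same change can be scripted. This is a minimal sketch, not part of the compendium: the helper name set_renviron_var and its three arguments are hypothetical. It removes any earlier setting of the key and writes the new one on the first line of the file.

```shell
# Sketch: set KEY=VALUE on the first line of an environment file,
# replacing any earlier setting of the same key.
# The function name and its arguments are illustrative assumptions.
set_renviron_var() (
  file="$1"; key="$2"; value="$3"
  touch "$file"
  tmp="$(mktemp)"
  # Keep every line except an existing assignment of this key
  # (grep exits non-zero when nothing is left, hence `|| true`)
  grep -v "^${key}=" "$file" > "$tmp" || true
  # Write the new assignment first, then the remaining lines
  { echo "${key}=${value}"; cat "$tmp"; } > "$file"
  rm -f "$tmp"
)

# Example (commented out so nothing is changed by accident):
# set_renviron_var "$HOME/.Renviron" R_MAX_VSIZE 100Gb
```

Restart R afterwards so that the new value is read at startup.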
This analysis requires a Google Maps Elevation API key, either set as an environment variable or passed to the GutakerEtAl2019.Rmd RMarkdown vignette as a parameter. Please see the [Running the analysis] sections below for guidance on setting this parameter.
There are three ways to run the analysis:
This analysis has been designed to take advantage of modern multi-core or multi-CPU computer architectures. By default, it will run on two cores, so the parallel sections of the code will run approximately twice as fast as on a single core. The analysis also consumes quite a bit of memory. On two (relatively high-speed) cores, run-time of the entire analysis is approximately 12 hours. This can be shortened dramatically by running with a higher number of cores/processors and a larger amount of memory, if available.
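To decide how many cores to give the analysis, it helps to know how many the machine has. The following is a rough sketch under the assumption that one of nproc, sysctl, or getconf is available; the function name detect_cores is hypothetical and not part of the compendium.

```shell
# Sketch: print the number of CPU cores, trying a few common tools in turn.
detect_cores() {
  if command -v nproc >/dev/null 2>&1; then
    # GNU coreutils, present on most Linux systems
    nproc
  elif command -v sysctl >/dev/null 2>&1 && sysctl -n hw.ncpu >/dev/null 2>&1; then
    # macOS and the BSDs
    sysctl -n hw.ncpu
  else
    # Fall back to getconf, or to 1 if nothing else works
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  fi
}

echo "Cores available: $(detect_cores)"
```

Leaving a core or two free for the operating system is usually sensible on a machine that is also used interactively.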
This is what most users will want to run if the goal is to explore how we developed the model, or to change parameters. Be sure that you have a working version of R installed (>= 3.5.1) and the RStudio development environment.
guedesbocinsky2018-1.0.0 directory.
## Install the devtools package, if not previously installed
# install.packages("devtools")
devtools::install_cran("remotes", upgrade_dependencies = FALSE)
devtools::install(".", dependencies = TRUE, upgrade_dependencies = FALSE)
remotes::install_local(".")
analysis/ directory.
GutakerEtAl2020.Rmd.
!r (through the end of the line) with your Google Maps Elevation API key (in single quotes). It should look something like this before replacement:
This is what you want to run to reproduce our results from the terminal. We strongly encourage you to run the analysis from R and RStudio if your goal is to explore how we developed the model, or to change parameters.
To run this analysis from the terminal, first you must ensure you have downloaded the compendium package and installed all system requirements. We've included a convenient script for running the entire analysis, including installing the compendium package.
First, set your environment variables in the terminal. On Unix-alike systems (including Linux and macOS), you can set environment variables in the terminal like so:
export google_maps_elevation_api_key=YOUR_API_KEY
Then, from within the gutaker2020_rice_niche directory in the terminal:
bash inst/gutaker2020_rice_niche_BASH.sh
Output will appear in the vignettes/ directory.
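Because the script depends on the API key being present in the environment, it can be worth failing early when the variable is missing instead of partway through a long run. The guard below is a minimal sketch; require_env is a hypothetical helper, not something shipped with the compendium.

```shell
# Sketch: fail early when a required environment variable is unset or empty.
# `require_env` is an illustrative helper name.
require_env() {
  # Look up the variable whose name is given in $1 (POSIX-compatible via eval)
  eval "val=\${$1:-}"
  if [ -z "$val" ]; then
    echo "Error: environment variable $1 is not set" >&2
    return 1
  fi
}

# Example (commented out): only launch the script when the key is available
# require_env google_maps_elevation_api_key && bash inst/gutaker2020_rice_niche_BASH.sh
```

The helper only checks that the variable is non-empty; it does not check that the key itself is valid.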
This is what you want to run to reproduce our results precisely. We strongly encourage you to run the analysis from R and RStudio if your goal is to explore how we developed the model, or to change parameters.
Docker is a virtual computing environment that facilitates reproducible research---it allows for research results to be produced independent of the machine on which they are computed. Docker users describe computing environments in a text format called a "Dockerfile", which when read by the Docker software builds a virtual machine, or "container". Other users can then load the container on their own computers. Users can upload container images to Docker Hub, and the image for this research (without the analyses run) is available at https://hub.docker.com/r/bocinsky/gutaker2020_rice_niche/.
We have included a Dockerfile which builds a Docker container for running the analyses described in the paper. It uses rocker/geospatial:3.4.4 as its base image, which provides R, RStudio Server, and the tidyverse of R packages, and adds several geospatial software packages (GDAL, GEOS, and proj.4). The Docker image (1) adds ffmpeg, (2) updates the R packages, and (3) installs the R software packages required by this package.
The commands below demonstrate three ways to run the Docker container. See this Docker cheat sheet for other arguments. Using the ":1.0.0" tag will ensure you are running the version of the code that generates the d'Alpoim Guedes and Bocinsky (2018) results. The first time you run the Docker image, it will be downloaded from Docker Hub.
Set your environment variables in the terminal. On Unix-alike systems (including Linux and macOS), you can set environment variables in the terminal like so:
export google_maps_elevation_api_key=YOUR_API_KEY
To run the analyses directly, render the gutaker2020_rice_niche.Rmd RMarkdown document at the end of the run command like so (in the terminal):
docker run bocinsky/gutaker2020_rice_niche:1.0.0 r -e "rmarkdown::render('/gutaker2020_rice_niche/analysis/gutaker2020_rice_niche.Rmd', \
  params = list(cores = 1, \
  clean = FALSE, \
  google_maps_elevation_api_key = '$google_maps_elevation_api_key'))"
Alternatively, you can run the container in interactive mode and load the script yourself like so (in the terminal):
docker run -it bocinsky/gutaker2020_rice_niche:1.0.0 bash
You can use the exit command to stop the container.
Finally, you can host RStudio Server locally to use the RStudio browser-based IDE. Run like so (in the terminal):
docker run -p 8787:8787 bocinsky/gutaker2020_rice_niche:1.0.0
Then, open a browser (we find Chrome works best) and navigate to "localhost:8787", or run docker-machine ip default in the shell to find the correct IP address, and log in with rstudio/rstudio as the user name and password. In the explorer (lower right pane in RStudio), navigate to the guedesbocinsky2018 directory, and click GutakerEtAl2019.Rproj to open the project.
If you wish to build the Docker container locally for this project from scratch, simply cd into this gutaker2019/ directory and run like so (in the terminal):
docker build -t bocinsky/gutaker2019 .
The -t argument gives the resulting container image a name. You can then run the container as described above, except without the tag.
We have also included a bash script that builds the Docker container, executes the analysis, and moves the results onto your local machine. To use it, open the terminal, make sure you are in the gutaker2019/ directory, then run the following:
First, set your environment variables in the terminal. On Unix-alike systems (including Linux and macOS), you can set environment variables in the terminal like so:
export google_maps_elevation_api_key=YOUR_API_KEY
Then, change into the gutaker2019/ directory, and run the convenience script:
bash inst/gutaker2019_DOCKER.sh
The entire analysis will appear in a docker_out/ directory when the analysis finishes.
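Once a run has finished, the total size of an output directory can be checked from the terminal. The sketch below sums the sizes of all regular files under a directory and converts bytes to gigabytes (10^9 bytes, rounded to two decimals). The function names dir_size_bytes and bytes_to_gb are hypothetical, and the docker_out/ path in the example is only the directory named above.

```shell
# Sketch: total size in bytes of all regular files under a directory
# (hidden files included).
dir_size_bytes() {
  # `wc -c` prints "<bytes> <name>" for each file; awk adds up the first field
  find "$1" -type f -exec wc -c {} \; |
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Convert a byte count to gigabytes (10^9 bytes), rounded to two decimals
bytes_to_gb() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1000000000 }'
}

# Example (commented out): report the size of the Docker output directory
# bytes_to_gb "$(dir_size_bytes docker_out)"
```

Running wc once per file is slow for very large trees, but it avoids the "total" lines that wc prints when given several files at once.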
The GitHub repository for this project does not contain the output generated by the script---r "analysis/zenodo" %>% list.files(all.files = TRUE, recursive = TRUE, full.names = TRUE) %>% file.size() %>% sum() %>% magrittr::divide_by(1000000000) %>% round(digits = 2) GB of compressed data. All output data is available as a separate Zenodo archive at:
The vignettes/ directory contains all data generated by the GutakerEtAl2019.Rmd RMarkdown vignette:
data/raw_data contains data downloaded from web sources for this analysis
data/derived_data/ contains tables of the raw site chronometric data without locational information, and the modeled chronometric probability and niche information for each site
data/derived_data/models/ contains R data objects describing the Kriging interpolation models across the study area
data/derived_data/recons/ contains NetCDF format raster bricks of the model output (i.e., the reconstructed crop niches)
figures/ contains all figures output by the script, including videos of how each crop niche changes over time
submission/ contains all of the figures, tables, movies, and supplemental datasets included with Gutaker et al. (2019)
Text and figures : CC-BY-4.0
Code : GNU GPLv3
Data : CC-0 attribution requested in reuse
We welcome contributions from everyone. Before you get started, please see our contributor guidelines. Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
This compendium was created using the rrtools package by Ben Marwick, which is ✨ pure magic ✨ for doing reproducible research.