knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
While renv
can help capture the state of your R library at some point in time,
there are still other aspects of the system that can influence the runtime
behavior of your R application. In particular, the same R code can produce
different results depending on:
And so on. Docker is a tool that helps solve this problem through the use of containers. Very roughly speaking, one can think of a container as a small, self-contained system within which different applications can be run. Using Docker, one can declaratively state how a container should be built (what operating system it should use, and what system software should be installed within), and use that system to run applications. (For more details, please see https://environments.rstudio.com/docker.)
Using Docker and renv
together, one can then ensure that both the underlying
system, alongside the required R packages, are fixed and constant for a
particular application.
The main challenges in using Docker with renv
are:
renv
cache is visible to Docker containers, andrenv
restores the R packages as required when the
container is run.This vignette will assume you are already familiar with Docker; if you are not
yet familiar with Docker, the Docker Documentation
provides a thorough introduction. To learn more about using Docker to manage R
environments, visit
environments.rstudio.com. We'll
discuss two strategies for using renv
with Docker:
renv
to install packages when the Docker image is generated;renv
to install packages when Docker containers are run.We'll explore the pros and cons of each strategy.
With Docker, Dockerfiles are used to define new images. Dockerfiles can be used to declaratively specify how a Docker image should be created. A Docker image captures the state of a machine at some point in time -- e.g., an Ubuntu operating system after downloading and installing R 3.5. Docker containers can be created using that image as a base, allowing isolated applications to run using the same pre-defined machine state.
First, you'll need to get renv
installed on your Docker image. The easiest
way to accomplish this is with the remotes
package. For example:
ENV RENV_VERSION `r renv:::renv_package_version("renv")` RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))" RUN R -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"
Now, renv
can be used to install packages on the image. If you'd like the
renv.lock
lockfile to be used to install R packages when the Docker image is
built, you can include something of the form:
WORKDIR /project COPY renv.lock renv.lock RUN R -e 'renv::restore()'
With this, renv
will download and install packages from CRAN and other
external sources as appropriate when the image is created.
There are two main downsides to this approach:
The set of R packages used is pre-baked into the image, so different applications or containers built from this image will either have to re-use the aforementioned set of packages, or reinstall the packages they need to update as required.
With this approach, the renv
package cache will not be used. This
implies that package installation through renv::restore()
may be
very slow, as all packages will have to be installed.
Both of these issues can be solved if package installation can be deferred to container runtime.
If you'd like to leverage the renv
package cache alongside Docker, then
you'll need to alter how your containers are created so that renv
can
ensure the project library is initialized before your application is run.
One can control the renv
cache directory with the environment variable
RENV_PATHS_CACHE
. For example:
Sys.setenv(RENV_PATHS_CACHE = "~/path/to/cache") renv:::renv_paths_cache()
Note that the platform and R version in use are appended to the requested cache directory. This ensures that a single directory can act a base of cached packages for multiple different platforms and R versions.
Next, we need to figure out how to tell the Docker containers we create
to use this cache. The most common option here is to mount a directory
in the container that maps to persistent storage on the host system, and
then set the aforementioned RENV_PATHS_CACHE
environment variable to
point to this mount. You can specify this when the container is launched.
For example, if you had a container running a Shiny application:
# the path to an renv cache on the host machine RENV_PATHS_CACHE_HOST=/opt/local/renv/cache # the path to the cache within the container RENV_PATHS_CACHE_CONTAINER=/renv/cacheA # run the container with the host cache mounted in the container docker run --rm \ -e "RENV_PATHS_CACHE=${RENV_PATHS_CACHE_CONTAINER}" \ -v "${RENV_PATHS_CACHE_HOST}:${RENV_PATHS_CACHE_CONTAINER}" \ -p 14618:14618 \ R --vanilla -s -e 'renv::restore(); shiny::runApp(host = "0.0.0.0", port = 14618)'
With this, any calls to renv
APIs within the created docker container will
have access to the mounted cache. The first time you run a container, renv
will likely need to populate the cache, and so some time will be spent
downloading and installing the required packages. Subsequent runs should be much
faster, as renv
will be able to reuse the global package cache.
The primary downside with this approach compared to the image-based approach
is that it requires you to modify how containers are created, and requires
a bit of extra orchestration in how containers are launched. However, once
the renv
cache is active, newly-created containers will launch very quickly,
and a single image can then be used as a base for a myriad of different
containers and applications, each with their own private R library.
When \R is launched within a project folder, he renv
auto-loader (if present)
will attempt to download and install renv
into the project library. Depending
on how your Docker container is configured, this could fail. For example:
Error installing renv: ====================== ERROR: unable to create ‘/usr/local/pipe/renv/library/master/R-4.0/x86_64-pc-linux-gnu/renv’ Warning messages: 1: In system2(r, args, stdout = TRUE, stderr = TRUE) : running command ''/usr/lib/R/bin/R' --vanilla CMD INSTALL -l 'renv/library/master/R-4.0/x86_64-pc-linux-gnu' '/tmp/RtmpwM7ooh/renv_0.12.2.tar.gz' 2>&1' had status 1 2: Failed to find an renv installation: the project will not be loaded. Use `renv::activate()` to re-initialize the project.
Bootstrapping renv
into the project library might be un-necessary for you. If
that is the case, then you can avoid this behavior by launching R with the
--vanilla
flag set; for example:
R --vanilla -s -e 'renv::restore()'
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.