It can be tricky to deal with auth in a non-interactive setting on a remote machine. Specifically, we're thinking about running tests on a continuous integration (CI) service, such as GitHub Actions, or deploying a data product, such as a Shiny app.
This article documents a token management approach for packages and apps that use gargle, which includes packages like googledrive, googlesheets4, bigrquery, and gmailr. We want it to be relatively easy to have a secret, such as a service account token, that we can use locally, on CI services such as GitHub Actions, and on hosted services such as R-hub or Posit Connect, all while keeping the secret secure.
The approach uses symmetric encryption, where the shared key is stored in an environment variable.
Why?
This works well with existing conventions for local R usage.
Most CI or hosting services offer support for secure environment variables.
And R-hub accepts environment variables via the env_vars argument of rhub::check().
This mostly uses functions inlined from the httr2 (https://httr2.r-lib.org/) package, which gargle does not (yet) depend on.
library(gargle)
Pick a name for the encryption key.
I recommend that it be clearly associated with whatever package or data product you plan to use it with.
For example, gargle's testing credentials are encrypted with a key named GARGLE_KEY.
You don't need to store this name as a variable. We're only doing so because it makes this exposition easier.
key_name <- "SOMETHING_KEY"
In real life, you should keep the output of secret_make_key() to yourself!
We reveal it here as part of the exposition.
key <- secret_make_key()
key
gargle::secret_make_key() is a copy of httr2::secret_make_key().
Combine the key name and value to form a line like this in your user-level .Renviron file:
cat(paste0(key_name, "=", key), sep = "\n")
usethis::edit_r_environ() can help create or open this file.
I strongly recommend using the user-level .Renviron, as opposed to project-level, because this makes it less likely you will share sensitive information by mistake.
If for some reason you choose to store the key in a file inside a Git repo, you must make sure that file is listed in .gitignore.
This still would not prevent leaking your secret if, for example, that project is in a directory that syncs to Dropbox or Google Drive (i.e. any service that has no real notion of an "ignore" file).
Remember you'll need to restart R (or call readRenviron("~/.Renviron")) for the newly defined environment variable to take effect.
Sys.setenv(SOMETHING_KEY = key)
In an interactive session, you can call Sys.getenv() to do a quick check that the key is set up correctly locally:
Sys.getenv("SOMETHING_KEY")
This Sys.getenv() call is exactly the sort of thing you should be very careful about doing in a deployed setting, where the result could end up in a (semi-)public log file.
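A minimal sketch of a safer check, assuming you only need to know whether the key is defined and not what its value is. The helper name key_is_set() and the placeholder value are hypothetical; only base R is used.

```r
# Hypothetical helper: report whether an environment variable is defined,
# without ever printing its value.
key_is_set <- function(envvar) {
  nzchar(Sys.getenv(envvar, unset = ""))
}

# Placeholder value for illustration only; this is not a real key.
Sys.setenv(SOMETHING_KEY = "not-a-real-key")

# Logs only TRUE or FALSE, never the secret itself.
message("SOMETHING_KEY defined: ", key_is_set("SOMETHING_KEY"))
```

A check like this can be left in deployed code, because the only thing that reaches the log is a logical value.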
The Google auth ecosystem involves different types of secrets, which require slightly different handling when you're placing an encrypted version inside your project.
secret_encrypt_json() is a gargle-specific function, built on top of httr2's secret management machinery.
This is because JSON files and strings are especially relevant to auth in the Google ecosystem.
You will be interested in secret_encrypt_json() if you want to encrypt a service account key (or, even, an OAuth client).
secret_encrypt_json() takes 3 arguments:
json: Probably the path to a JSON file, but a JSON string is also acceptable.
path: The path to write the encrypted JSON to. Technically this is optional, but this function mostly exists to write to file.
key: The name of the environment variable that holds the encryption key.
This example shows how googledrive's testing credentials are placed inside the package source.
googledrive-testing.json is a JSON file downloaded for a service account managed via the Google API / Cloud Platform console:
secret_encrypt_json(
  json = "~/some/place/where/I/keep/secret/stuff/googledrive-testing.json",
  path = "inst/secret/googledrive-testing.json",
  key = "GOOGLEDRIVE_KEY"
)
This writes an encrypted version of googledrive-testing.json to inst/secret/googledrive-testing.json relative to the current working directory, which is presumably the top-level directory of googledrive's source.
This encrypted file should be committed and pushed.
Later we show how to use secret_decrypt_json() to decrypt this token.
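To see the whole cycle in one place, here is a small round-trip sketch that can be run without any real credentials. It assumes a throwaway key stored in a made-up environment variable, DEMO_KEY, and uses a stand-in JSON string rather than a real service account file.

```r
library(gargle)

# Throwaway key in a hypothetical env var, for illustration only
Sys.setenv(DEMO_KEY = secret_make_key())

# Stand-in JSON string; a real service account key has many more fields
json <- '{"type": "service_account", "project_id": "demo-project"}'

# Encrypt the JSON and write the result to a temporary file
enc_path <- tempfile(fileext = ".json")
secret_encrypt_json(json, path = enc_path, key = "DEMO_KEY")

# Decrypt it again, using the same env var name to look up the key
decrypted <- secret_decrypt_json(enc_path, key = "DEMO_KEY")
identical(decrypted, json)
```

Both calls receive the name of the environment variable, not the key itself, so the key never needs to appear in your code.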
gargle::secret_write_rds() is a copy of httr2::secret_write_rds(), exported by gargle for convenience.
If you must encrypt an R object, such as a Gargle2.0 user token, this is the function you need.
But note that it should be quite rare to encrypt a user token.
If at all possible, use a service account instead.
secret_write_rds() takes 3 arguments:
x: The R object to encrypt. In the gargle context, this is usually a token. After a successful OAuth dance, wrapper packages often provide access to the token with a function like googledrive::drive_token(), googlesheets4::gs4_token(), bigrquery::bq_token(), or gmailr::gm_token().
path: The path to write the encrypted object to. Technically this is optional, but this function mostly exists to write to file.
key: The name of the environment variable that holds the encryption key.
This example shows how an encrypted googlesheets4 user token could be placed inside the .secrets/ directory of a project, e.g. a Shiny app intended for deployment.
library(googlesheets4)

dir.create(".secrets")

# get a token and DO NOT CACHE IT
gs4_auth("someone@example.com", cache = FALSE)

# encrypt the token and write to file
gargle::secret_write_rds(
  gs4_token(),
  ".secrets/gs4-token.rds",
  key = "SOMETHING_KEY"
)
This writes an encrypted version of the token to .secrets/gs4-token.rds.
This encrypted file should be committed and pushed/deployed.
Later we show how to use gargle::secret_read_rds() to decrypt this token.
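The same write-then-read cycle can be tried on an ordinary R object before involving a real token. This is a sketch under stated assumptions: DEMO_KEY is a made-up environment variable holding a throwaway key, and the list below merely stands in for a token.

```r
library(gargle)

# Throwaway key in a hypothetical env var, for illustration only
Sys.setenv(DEMO_KEY = secret_make_key())

# A plain list standing in for a real token object
obj <- list(note = "stand-in for a token", created = as.Date("2024-01-01"))

# Encrypt the object and write it to a temporary file
rds_path <- tempfile(fileext = ".rds")
secret_write_rds(obj, rds_path, key = "DEMO_KEY")

# Read and decrypt; the restored object should match the original
restored <- secret_read_rds(rds_path, key = "DEMO_KEY")
identical(restored, obj)
```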
Here's how you make the encryption key available when your code is running elsewhere.
On GitHub Actions, define the environment variable as an encrypted secret in your repo:
https://docs.github.com/en/actions/security-guides/encrypted-secrets
Use the secrets context to expose a secret as an environment variable in your workflows. That will look like so, in some appropriate place in your workflow file:
env:
  SOMETHING_KEY: ${{ secrets.SOMETHING_KEY }}
The secret, and therefore the associated environment variable, is not available when workflows are triggered via an external pull request.
Send the environment variable in your calls to rhub::check() and friends:
rhub::check(env_vars = Sys.getenv("SOMETHING_KEY", names = TRUE))
On Posit Connect, define the environment variable in the {X} Vars pane of the dashboard for your content:
https://docs.posit.co/connect/user/content-settings/#content-vars
It should come as no surprise that secret_encrypt_json() and secret_write_rds() each have a companion function for decryption: secret_decrypt_json() and secret_read_rds(), respectively.
Recall that in the example above we encrypted the JSON specifying a service account token, for use in CI by googledrive.
Here's how you would use secret_decrypt_json() to decrypt that token and direct googledrive to use it:
library(googledrive)

drive_auth(
  path = gargle::secret_decrypt_json(
    system.file("secret", "googledrive-testing.json", package = "googledrive"),
    "GOOGLEDRIVE_KEY"
  )
)
Recall that in the example above we encrypted a googlesheets4 user token, for use inside something like a deployed Shiny app.
Here's how you would use secret_read_rds() to decrypt that token and direct googlesheets4 to use it:
library(googlesheets4)

gs4_auth(token = gargle::secret_read_rds(
  ".secrets/gs4-token.rds",
  key = "SOMETHING_KEY"
))
The snippets above are great when they work, i.e. when "SOMETHING_KEY" is available for decryption.
But what about when the key isn't available?
You do want to rig things for graceful, informative failure in this case.
secret_has_key("SOMETHING_KEY") reports whether the "SOMETHING_KEY" environment variable is defined.
In a deployed data product, you might want to call secret_has_key() before any attempt to decrypt a secret.
If the encryption key is not available, report that finding and arrange to do something graceful instead of erroring, especially in some cryptic, difficult-to-debug way.
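One way to sketch this for the googlesheets4 example above, assuming the encrypted token lives at .secrets/gs4-token.rds. The helper name auth_if_possible() is hypothetical, and what counts as "graceful" will depend on your app; here the helper just reports the problem and returns FALSE.

```r
library(gargle)

# Hypothetical helper: attempt auth only when the encryption key is available.
# Returns TRUE (invisibly) if auth was attempted, FALSE otherwise.
auth_if_possible <- function(path = ".secrets/gs4-token.rds",
                             key = "SOMETHING_KEY") {
  if (!secret_has_key(key)) {
    # Name the missing env var, but never print a key value
    message("Env var ", key, " is not defined; continuing without auth.")
    return(invisible(FALSE))
  }
  googlesheets4::gs4_auth(token = secret_read_rds(path, key = key))
  invisible(TRUE)
}
```

The caller can then branch on the result of auth_if_possible(), for example by showing a notice in the app instead of attempting any request that needs auth.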
The secret_* functions have a built-in feature such that, if they are called during testing, when the encryption key is unavailable, that test is skipped.
That behaviour is implemented in the internal helper secret_get_key(), which looks something like this:
secret_get_key <- function(envvar) {
  key <- Sys.getenv(envvar)
  if (identical(key, "")) {
    if (is_testing()) {
      msg <- glue("Env var {envvar} not defined.")
      testthat::skip(msg)
    } else {
      # error
    }
  }
  # return the key
}
If envvar (presumably, SOMETHING_KEY or the like) is undefined, during tests, that test is just skipped.
Note that "during tests" is defined as when is_testing() returns TRUE.
The is_testing() helper is defined like so:
is_testing <- function() {
  identical(Sys.getenv("TESTTHAT"), "true")
}
Therefore automatic skipping will happen during automated testing, including on CRAN, and for external contributors.
The automatic skips won't kick in when you're just, e.g., running a single test "by hand".
The "TESTTHAT" environment variable is set by functions like devtools::test() or testthat::test_file().
I will also point out that this is not how test skipping is achieved in packages like googledrive, googlesheets4, bigrquery, and gmailr.
Those packages are all designed to load a token into an internal auth state, then use that token in downstream requests.
This means that individual requests or tests won't ever call secret_decrypt_json() or secret_read_rds(), so the automatic skips aren't relevant.
These packages make different arrangements for skipping auth-requiring tests when the testing credentials are unavailable.
The source code for those packages is the best place to learn more.
Start by consulting the package's tests/testthat/helper.R file.
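As one possible arrangement (a hypothetical sketch, not the code those packages actually use), a helper in tests/testthat/helper.R could skip a test whenever the key is missing. The name skip_if_no_key() is made up for illustration; it relies only on testthat::skip_if_not() and gargle::secret_has_key().

```r
# Hypothetical test helper: skip the current test when the encryption key
# is unavailable, e.g. on CRAN or in a pull request from a fork.
skip_if_no_key <- function(envvar = "SOMETHING_KEY") {
  testthat::skip_if_not(
    gargle::secret_has_key(envvar),
    paste0("Env var ", envvar, " not defined.")
  )
}
```

Calling skip_if_no_key() at the top of an auth-requiring test would then turn a missing key into a skip rather than a failure.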
I recommend that you actively check your package under the "no decryption, no token" scenario, so that you discover problems before CRAN or your contributors do.
In fact, this should probably be the default situation for your R CMD check workflow.
In an auth-requiring package, we usually have two R CMD check workflows:
R-CMD-check.yaml is the main workflow, which tests the package against a relatively large matrix of operating systems and R versions. This workflow does not have access to the encryption key.
with-auth.yaml is another R CMD check workflow that only checks with the released version of R, on ubuntu-latest. This workflow does have access to the encryption key. Here's the bit of the .yaml file where that happens:
- uses: r-lib/actions/check-r-package@v2
  env:
    SOMETHING_KEY: ${{ secrets.SOMETHING_KEY }}
Look at the GitHub Actions workflow configurations for googledrive, googlesheets4, bigrquery, and gmailr, to see some concrete examples.
The Wrapping APIs vignette for the httr2 package, specifically the "Secret management" section.
The How does cryptography work? vignette for the sodium package, specifically the "Symmetric encryption" section.