In MarkEdmondson1234/googleCloudRunner: R Scripts in the Google Cloud via Cloud Run, Cloud Build and Cloud Scheduler

Being able to trigger R code via public HTTP requests opens up lots of possibilities using webhooks, API calls and so on. You can also make use of private HTTP requests so that services running multiple languages can also easily interface with one another using a concept called micro-services.

The use case described below uses R code deployed to a private as a strategy for running lots of parallel computation, taking advantage of Cloud Run's autoscaling features to turn up and down compute resources as needed.

Hosting the computation behind a plumber API and hosted on Cloud Run, you can control what data is worked on via URL parameters, and loop calling the API.

The R computation behind plumber could call other services or compute directly, although keep in mind the 60min timeout for Cloud Run API responses.

Creating the Cloud Run deployment

The demo API queries a BigQuery database and computes a forecast for the time-series it gets back. The data is from the Covid public datasets hosted on BigQuery.

The plumber script below has two endpoints:

/ to test its running
/covid_traffic which is fetched with parameters region and industry to specify what times-series to fetch a forecast for.

We will want these to be private (e.g. only authenticated calls) since they contain potentially expensive BigQuery queries.

plumber API script

if(Sys.getenv("PORT") == "") Sys.setenv(PORT = 8000)

#auth via BQ_AUTH_FILE environment argument
library(bigQueryR)
library(xts)
library(forecast)

#' @get /
#' @html
function(){
  "<html><h1>It works!</h1></html>"
}


#' @get /covid_traffic
#' @param industry the industry to filter results down to e.g "Software"
#' @param region the region to filter results down to e.g "Europe"
function(region=NULL, industry=NULL){

  if(any(is.null(region), is.null(industry))){
    stop("Must supply region and industry parameters")
  }

  # to handle spaces
  region <- URLdecode(region)
  industry <- URLdecode(industry)

  sql <- sprintf("SELECT date, industry, percent_of_baseline FROM `bigquery-public-data.covid19_geotab_mobility_impact.commercial_traffic_by_industry`  WHERE region = '%s' order by date LIMIT 1000", region)

  message("Query: ", sql)

  traffic <- bqr_query(
    query = sql,
    datasetId = "covid19_geotab_mobility_impact",
    useLegacySql = FALSE,
    maxResults = 10000
  )

  # filter to industry in R this time
  test_data <- traffic[traffic$industry == industry,
                       c("date","percent_of_baseline")]

  tts <- xts(test_data$percent_of_baseline,
             order.by = test_data$date,
             frequency = 7)

  # replace with long running sophisticated analysis
  model <- forecast(auto.arima(tts))

  # output a list that can be turned into JSON via jsonlite::toJSON
  o <- list(
    params = c(region, industry),
    x = model$x,
    mean = model$mean,
    lower = model$lower,
    upper = model$upper
  )

  message("Return: ", jsonlite::toJSON(o))

  o

}

Deploy to Cloud Run and reusing the default authentication

When you deploy to Cloud Run, you can choose which service key the Cloud Run service will run under, the default being the GCE default service key. This means you can authenticate using this key without needing to upload your own service JSON file. An example is available in the example Cloud Run app included with the package, deployable via cr_deploy_plumber(system.file("example", package = "googleCloudRunner"))

The relevant R code is shown below, which lets you list a Google Cloud Storage bucket within the same GCP project, reusing the default authentication.

#' List a Google Cloud Storage bucket as an auth example
#' @get /gcs_list
#' @param bucket the bucket to list.  Must be authenticated for this Cloud Run service account
function(bucket=NULL){
  if(is.null(bucket)){
    return("No bucket specified in URL parameter e.g ?bucket=my-bucket")
  }

  library(googleCloudStorageR)

  auth <- gargle::credentials_gce()
  if(is.null(auth)){
    return("Could not authenticate")
  }

  message("Authenticated with service token")

  # put it into googleCloudStorageR auth
  gcs_auth(token = auth)

  gcs_list_objects(bucket)

}

Once deployed, you can see it listing objects via the endpoint browseURL("https://{your-app-endpoint}/gcs_list?bucket={your-bucket}")

Deploy plumber API as a private micro-service

The first step is to host the plumber API above.

The steps below:

Download an authentication file for the script
Build the Docker container for the plumber API
Deploys the plumber API to a private Cloud Run instance
Creates the cloudbuild.yml and writes it to a file in the repo

library(googleCloudRunner)

bs <- c(
  cr_buildstep_secret("my-auth",
      decrypted = "inst/docker/parallel_cloudrun/plumber/auth.json"),
  cr_buildstep_docker("cloudrun_parallel",
                      dir = "inst/docker/parallel_cloudrun/plumber",
                      kaniko_cache = TRUE),
  cr_buildstep_run("parallel-cloudrun",
                   image = "gcr.io/$PROJECT_ID/cloudrun_parallel:$BUILD_ID",
                   allowUnauthenticated = FALSE,
                   env_vars = "BQ_AUTH_FILE=auth.json,BQ_DEFAULT_PROJECT_ID=$PROJECT_ID")
)

# make a buildtrigger pointing at above steps
cloudbuild_file <- "inst/docker/parallel_cloudrun/cloudbuild.yml"
by <- cr_build_yaml(bs)
cr_build_write(by, file = cloudbuild_file)

Create a build trigger so it builds upon each git commit

Now the cloudbuild.yml is available, a build trigger is created so that it will deploy upon each commit to this GitHub:

repo <- cr_buildtrigger_repo("MarkEdmondson1234/googleCloudRunner")
cr_buildtrigger(cloudbuild_file,
                "parallel-cloudrun",
                trigger = repo,
                includedFiles = "inst/docker/parallel_cloudrun/**")

The files are then committed to GitHub and the build reviewed in the GCP Console.

Once the builds are done, you should have a plumber API deployed with a URL similar to https://parallel-cloudrun-ewjogewawq-ew.a.run.app/

Calling the private plumber API with a JWT

Now the plumber API is up, but to call it with authentication will need a JSON Web Token, or JWT. These are supported via the cr_jwt_create() functions:

cr <- cr_run_get("parallel-cloudrun")

# Interact with the authenticated Cloud Run service
the_url <- cr$status$url
jwt <- cr_jwt_create(the_url)

# needs to be recreated every 60mins
token <- cr_jwt_token(jwt, the_url)

# call Cloud Run with token using httr call
library(httr)
res <- cr_jwt_with_httr(
  GET("https://parallel-cloudrun-ewjogewawq-ew.a.run.app/"),
  token)
content(res)

The JWT token needs renewing each hour.

Use your private micro-service

The above shows you can call the API's test endpoint, but now we wish to call the API many times with the parameters to filter to the data we want.

This is facilitated by a wrapper function to call our API:

# interact with the API we made
call_api <- function(region, industry, token){
  api <- sprintf(
    "https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=%s&industry=%s",
    URLencode(region), URLencode(industry))

  message("Request: ", api)
  res <- cr_jwt_with_httr(httr::GET(api),token)

  httr::content(res, as = "text", encoding = "UTF-8")

}

# test call
result <- call_api(region = "Europe", industry = "Software", token = token)

Once the test call is complete, you can loop over your data - the parameters for this particular dataset are shown below:

# the variables to loop over
regions <- c("North America", "Europe","South America","Australia")
industry <- c("Transportation (non-freight)",
              "Software",
              "Telecommunications",
              "Manufacturing",
              "Real Estate",
              "Energy & Utilities",
              "Education",
              "Insurance",
              "Media & Internet",
              "Minerals & Mining",
              "Healthcare Services & Hospitals",
              "Organizations",
              "Finance")

Parallel calling of the API

For parallel processing, ideally you don't want R to block waiting for the HTTP response for each URL you are sending. This is easiest achieved through using curl and its curl_fetch_multi() function.

A helper function that takes care of adding multiple URLs to the same curl pool is cr_jwt_with_curl() - if you pass it a vector of URL strings to call and the token, it will attempt to call all of them at the same time.

An example calling each possible combination of the industries/regions above is shown below, which utilises expand.grid() to create the URL variations.

## curl multi asynch

# function to make the URLs
make_urls <- function(regions, industry){

  combos <- expand.grid(regions, industry, stringsAsFactors = FALSE)

  unlist(mapply(function(x,y){sprintf("https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=%s&industry=%s",URLencode(x), URLencode(y))},
                SIMPLIFY = FALSE, USE.NAMES = FALSE,
         combos$Var1 ,combos$Var2))
}

# make all the URLs
all_urls <- make_urls(regions = regions, industry = industry)

# calling them
cr_jwt_async(all_urls, token = token)

Some example output is shown below - note that some combinations didn't have data so the API returned a 500 error, but later valid combinations completed and returned the JSON response for the forecast:

> # calling them
> cr_jwt_async(all_urls, token = token)
ℹ 2020-09-20 21:58:06 > Calling asynch:  https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=North%20America&industry=Transportation%20(non-freight)
ℹ 2020-09-20 21:58:06 > Calling asynch:  https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=Europe&industry=Transportation%20(non-freight)
ℹ 2020-09-20 21:58:06 > Calling asynch:  https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=South%20America&industry=Transportation%20(non-freight)
ℹ 2020-09-20 21:58:13 > 500 failure for request https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=North%20America&industry=Software
ℹ 2020-09-20 21:58:06 > Calling asynch:  https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=Australia&industry=Transportation%20(non-freight)
ℹ 2020-09-20 21:58:06 > Calling asynch:  https://parallel-cloudrun-ewjogewawq-ew.a.run.app/covid_traffic?region=North%20America&industry=Software
[[1]]
[1] "{\"params\":[\"North America\",\"Transportation (non-freight)\"],\"x\":[100,99,108,104,105,104,105,102,102,108,104,105,104,105,99,98,59,78,78,78,78,100,100,108,106,105,105,105,100,103,109,104,104,105,106,103,102,108,104,104,103,100,96,97,67,64,62,60,59],\"mean\":[64.031,68.3173,71.9691,75.0803,77.731,79.9894,81.9135,83.5527,84.9493,86.1392],\"lower\":[[52.8354,46.9089],[53.6094,45.8236],[55.1655,46.2703],[56.9063,47.2856],[58.6237,48.5089],[60.2322,49.7734],[61.6977,50.9961],[63.0104,52.136],[64.1733,53.1751],[65.1951,54.108]],\"upper\":[[75.2265,81.1531],[83.0251,90.8109],[88.7726,97.6679],[93.2543,102.8751],[96.8384,106.9531],[99.7466,110.2054],[102.1292,112.8308],[104.095,114.9694],[105.7254,116.7235],[107.0833,118.1704]]}"

[[2]]
[1] "{\"params\":[\"South America\",\"Transportation (non-freight)\"],\"x\":[14,13,12,9,13,17,19,15,11,14,15,16,16,14,11,15,19,18,16,16,12,18,21,18,18,12,18,19,18,19,16,15,17,20,21,20,18,24,26,25,18,22,19,17,21,24,20,19,19,23,27,27,24,22,17,24,26,24,20,18,21,24,23,20,26,19,20,17,21,26,23,19,17,27,23,20,21,22,17,26,23,20,18,21,19,18,28,22,18,24,15,21,27,21,21],\"mean\":[19.7145,21.4333,23.6604,24.1475,22.5433,20.6944,20.6384,22.468,24.3021,24.3207],\"lower\":[[15.9943,14.0249],[17.6591,15.6611],[19.8826,17.8828],[20.3669,18.3656],[18.7179,16.6929],[16.7392,14.6454],[16.566,14.4103],[18.3612,16.1872],[20.1922,18.0166],[20.2036,18.0241]],\"upper\":[[23.4346,25.404],[25.2075,27.2054],[27.4381,29.438],[27.928,29.9293],[26.3686,28.3936],[24.6495,26.7433],[24.7107,26.8665],[26.5747,28.7487],[28.412,30.5876],[28.4378,30.6173]]}"

[[3]]
[1] "{\"params\":[\"Europe\",\"Transportation (non-freight)\"],\"x\":[40,52,55,50,52,44,59,55,54,60,63,43,54,49,46,38,64,55,53,51,58,61,42,43,40,54,53,60,63,54,44,43,52,38,53,51,47,53,58,44,54,62,56,59,63,48,62,66,60,68,73,69,71,77,77,63,66],\"mean\":[68.5688,68.5688,68.5688,68.5688,68.5688,68.5688,68.5688,68.5688,68.5688,68.5688],\"lower\":[[58.5753,53.2851],[58.1036,52.5637],[57.6523,51.8735],[57.2189,51.2107],[56.8015,50.5723],[56.3984,49.9558],[56.0082,49.359],[55.6297,48.7802],[55.2621,48.2179],[54.9043,47.6707]],\"upper\":[[78.5622,83.8525],[79.0339,84.5738],[79.4852,85.2641],[79.9186,85.9269],[80.3361,86.5653],[80.7392,87.1818],[81.1294,87.7786],[81.5078,88.3573],[81.8755,88.9196],[82.2333,89.4668]]}"

MarkEdmondson1234/googleCloudRunner documentation built on June 15, 2025, 2 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

MarkEdmondson1234/googleCloudRunner
R Scripts in the Google Cloud via Cloud Run, Cloud Build and Cloud Scheduler

In MarkEdmondson1234/googleCloudRunner: R Scripts in the Google Cloud via Cloud Run, Cloud Build and Cloud Scheduler

Creating the Cloud Run deployment

plumber API script

Deploy to Cloud Run and reusing the default authentication

Deploy plumber API as a private micro-service

Create a build trigger so it builds upon each git commit

Calling the private plumber API with a JWT

Use your private micro-service

Parallel calling of the API

R Package Documentation

Browse R Packages

We want your feedback!

MarkEdmondson1234/googleCloudRunner R Scripts in the Google Cloud via Cloud Run, Cloud Build and Cloud Scheduler

In MarkEdmondson1234/googleCloudRunner: R Scripts in the Google Cloud via Cloud Run, Cloud Build and Cloud Scheduler

Creating the Cloud Run deployment

plumber API script

Deploy to Cloud Run and reusing the default authentication

Deploy plumber API as a private micro-service

Create a build trigger so it builds upon each git commit

Calling the private plumber API with a JWT

Use your private micro-service

Parallel calling of the API

R Package Documentation

Browse R Packages

We want your feedback!

MarkEdmondson1234/googleCloudRunner
R Scripts in the Google Cloud via Cloud Run, Cloud Build and Cloud Scheduler