cr_build_targets: Set up Google Cloud Build to run a targets pipeline


cr_build_targets    R Documentation


Description

Creates a Google Cloud Build YAML file that executes tar_make pipelines.

Historical runs accumulate in the configured Google Cloud Storage bucket, and the latest output is downloaded before tar_make executes so up-to-date steps do not rerun.

Usage

cr_build_targets(
  buildsteps = cr_buildstep_targets_multi(),
  execute = c("trigger", "now"),
  path = "cloudbuild_targets.yaml",
  local = ".",
  predefinedAcl = "bucketLevel",
  bucket = cr_bucket_get(),
  download_folder = getwd(),
  ...
)

cr_build_targets_artifacts(
  build,
  bucket = cr_bucket_get(),
  target_folder = NULL,
  download_folder = NULL,
  target_subfolder = c("all", "meta", "objects", "user"),
  overwrite = TRUE
)

cr_buildstep_targets_single(
  target_folder = NULL,
  bucket = cr_bucket_get(),
  tar_config = NULL,
  task_image = "gcr.io/gcer-public/targets",
  task_args = NULL,
  tar_make = "targets::tar_make()"
)

cr_buildstep_targets_multi(
  target_folder = NULL,
  bucket = cr_bucket_get(),
  tar_config = NULL,
  task_image = "gcr.io/gcer-public/targets",
  task_args = NULL,
  last_id = NULL
)

Arguments

buildsteps

Generated buildsteps that create the targets build

execute

Whether to run the Cloud Build now or to write to a file for use within triggers or otherwise

path

File path to write the Google Cloud Build yaml workflow file. Set to NULL to write no file and just return the Yaml object.

local

If executing now, the local folder that will be uploaded as the context for the target build

predefinedAcl

The ACL rules for the object uploaded. Set to "bucketLevel" for buckets with bucket level access enabled

bucket

The Google Cloud Storage bucket the target metadata will be saved to in folder 'target_folder'

download_folder

Set to NULL to overwrite the local _targets folder (_targets/*); otherwise files are written to download_folder/_targets/*

...

Arguments passed on to cr_build_yaml

steps

A vector of cr_buildstep

timeout

How long the entire build is allowed to run. If not set, defaults to 10 minutes

logsBucket

Where logs are written. If you don't set this field, Cloud Build will use a default bucket to store your build logs.

options

A named list of options

substitutions

Build macros that will replace entries in other elements

tags

Tags for the build

secrets

A secrets object

images

What images will be built from this cloudbuild

artifacts

What artifacts may be built from this cloudbuild - create via cr_build_yaml_artifact

availableSecrets

What environment arguments from Secret Manager are available to the build - create via cr_build_yaml_secrets

serviceAccount

What service account should the build be run under?

build

A Build object that includes the artifact location

target_folder

Where target metadata will sit within the Google Cloud Storage bucket as a folder. If NULL defaults to RStudio project name or "targets_cloudbuild" if no RStudio project found.

target_subfolder

If you only want to download a specific folder from the _targets/ folder on Cloud Build then specify it here.

overwrite

Whether to overwrite existing local data

tar_config

An R script that will run before targets::tar_make() in the build e.g. "targets::tar_config_set(script = 'targets/_targets.R')"

task_image

An existing Docker image that will be used to run your targets workflow after the targets meta has been downloaded from Google Cloud Storage

task_args

A named list of additional arguments to send to cr_buildstep_r when it executes the tar_make command (such as environment arguments)

tar_make

The R script that will run in the tar_make() step. Modify to include custom settings such as "script"

last_id

The final buildstep that needs to complete before the upload. If left NULL then will default to the last tar_target step.
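As a rough illustration of how tar_config and task_args might be combined, here is a minimal sketch. The helper name multi_with_env, the bucket name, the target folder and the environment variable are hypothetical placeholders. It assumes that task_args entries such as env are forwarded to the underlying buildstep, as the argument description suggests. Nothing is evaluated until the function is called from a project that contains a targets script.

```r
# Hypothetical helper: all default values are placeholders, not package defaults.
multi_with_env <- function(bucket = "my-bucket",
                           target_folder = "my-targets") {
  googleCloudRunner::cr_buildstep_targets_multi(
    target_folder = target_folder,
    bucket = bucket,
    # R code run before tar_make(), here pointing targets at a non-default script
    tar_config = "targets::tar_config_set(script = 'targets/_targets.R')",
    # Assumption: 'env' is passed through to the buildstep that runs tar_make
    task_args = list(env = "MY_SETTING=value")
  )
}
```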

Details

Steps to set up your target task in Cloud Build:

  • Create your 'targets' workflow.

  • Create a Dockerfile that holds the R and system dependencies for your workflow, including the targets package. A Docker image with targets installed is available at gcr.io/gcer-public/targets. You can test the image using cr_deploy_docker.

  • Run cr_build_targets to create the cloudbuild yaml file.

  • Run the build via cr_build or similar. Each build should only recompute outdated targets.

  • Optionally create a build trigger via cr_buildtrigger.

  • Trigger a build. The first run executes the whole targets pipeline; subsequent runs recompute only the outdated targets.
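The steps above can be sketched as one helper function. This is a minimal sketch under stated assumptions, not a prescribed workflow: the helper name setup_targets_build, the trigger name and the repository are hypothetical, and it assumes googleCloudRunner is installed and authenticated, that the Docker image already exists, and that cloudbuild_targets.yaml will sit at the root of the triggering repository. Nothing contacts Google Cloud until the function is called.

```r
# Hypothetical helper wrapping the set-up steps; defaults are placeholders.
setup_targets_build <- function(task_image = "gcr.io/gcer-public/targets",
                                repo = "my-org/my-repo",
                                yaml_path = "cloudbuild_targets.yaml") {
  # Create the buildsteps that run the pipeline inside the chosen image
  bs <- googleCloudRunner::cr_buildstep_targets_multi(task_image = task_image)

  # Write the Cloud Build yaml file without running the build
  yaml <- googleCloudRunner::cr_build_targets(
    bs,
    execute = "trigger",
    path = yaml_path
  )

  # Optionally create a build trigger that uses the yaml file in the repository
  googleCloudRunner::cr_buildtrigger(
    yaml_path,
    name = "targets-pipeline",
    trigger = googleCloudRunner::cr_buildtrigger_repo(repo, branch = "main")
  )

  invisible(yaml)
}
```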

Use cr_build_targets_artifacts to download the return values of a target Cloud Build, then tar_read to read the results. You can set the downloaded files as the target store via targets::tar_config_set(store="_targets_cloudbuild"). Set download_folder = "_targets" to overwrite your local targets store.
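A minimal sketch of that download-then-read flow follows. The helper name run_and_read is hypothetical, and the sketch assumes that the object returned with execute = "now" is a finished Build that includes the artifact location. It leaves download_folder at its default of NULL, which overwrites the local _targets store, so the target can be read without changing the store configuration.

```r
# Hypothetical helper: run the pipeline on Cloud Build, then read one target locally.
run_and_read <- function(buildsteps, target_name = "merge1") {
  # Run the build immediately; the result is the built object
  build <- googleCloudRunner::cr_build_targets(buildsteps, execute = "now")

  # download_folder = NULL (the default) overwrites the local _targets folder
  googleCloudRunner::cr_build_targets_artifacts(build)

  # tar_read_raw() takes the target name as a character string
  targets::tar_read_raw(target_name)
}
```

With the example pipeline from this page, run_and_read(cr_buildstep_targets_multi()) would be one way to fetch merge1 after a cloud run.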

Value

A Yaml object as generated by cr_build_yaml if execute="trigger" or the built object if execute="now"

cr_build_targets_artifacts returns the file path to where the download occurred.

DAGs

If your target workflow has parallel processing steps, leaving the buildsteps argument as the default cr_buildstep_targets_multi() creates a build that uses waitFor and build ids to form a DAG. Setting it to cr_buildstep_targets_single() gives a single-threaded build, but you can then customise the targets::tar_make script. You can also add your own custom target buildsteps here using cr_buildstep_targets; for example, you could create the Docker environment that targets runs within before the main pipeline.
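As an illustration of the single-threaded option, the sketch below swaps the default buildsteps for cr_buildstep_targets_single() with a customised tar_make call. The helper name single_thread_yaml, the bucket, the target folder and the script path are hypothetical placeholders. It assumes the function is called from a project that contains the referenced targets script.

```r
# Hypothetical helper: build a single-threaded yaml with a custom tar_make call.
single_thread_yaml <- function(bucket = "my-bucket",
                               target_folder = "my-targets") {
  bs <- googleCloudRunner::cr_buildstep_targets_single(
    target_folder = target_folder,
    bucket = bucket,
    # Custom tar_make script, here pointing at a non-default targets script
    tar_make = "targets::tar_make(script = 'targets/_targets.R')"
  )

  # path = NULL writes no file and just returns the Yaml object
  googleCloudRunner::cr_build_targets(bs, path = NULL, bucket = bucket)
}
```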

See Also

cr_buildstep_targets if you want to customise the build

Other Cloud Build functions: Build(), RepoSource(), Source(), StorageSource(), cr_build_artifacts(), cr_build_list(), cr_build_logs(), cr_build_make(), cr_build_status(), cr_build_upload_gcs(), cr_build_wait(), cr_build_write(), cr_build_yaml_artifact(), cr_build_yaml_secrets(), cr_build_yaml(), cr_build()

Examples


write.csv(mtcars, file = "mtcars.csv", row.names = FALSE)

targets::tar_script(
  list(
    targets::tar_target(file1,
      "mtcars.csv", format = "file"),
    targets::tar_target(input1,
      read.csv(file1)),
    targets::tar_target(result1,
      sum(input1$mpg)),
    targets::tar_target(result2,
      mean(input1$mpg)),
    targets::tar_target(result3,
      max(input1$mpg)),
    targets::tar_target(result4,
      min(input1$mpg)),
    targets::tar_target(merge1,
      paste(result1, result2, result3, result4))
  ),
  ask = FALSE)

bs <- cr_buildstep_targets_multi()

# only create the yaml
par_build <- cr_build_targets(bs, path = NULL)
par_build

# clean up example
unlink("mtcars.csv")
unlink("_targets.R")

## Not run: 
# run it immediately in cloud
cr_build_targets(bs, execute="now")

# create a yaml file for use in build triggers
cr_build_targets(bs)

## End(Not run)


googleCloudRunner documentation built on March 18, 2022, 8 p.m.