TABLE, DATA, files, bucket, runtime, and disk elements


DEFUNCT - AnVIL GCP functions for TABLE, DATA, files, bucket, runtime, and disk elements.

avtable_import_status() queries the status of an 'asynchronous' table import.

avdata() returns key-value tables representing the information visualized under the DATA tab, 'REFERENCE DATA' and 'OTHER DATA' items. avdata_import() updates (modifies or creates new, but does not delete) rows in 'REFERENCE DATA' or 'OTHER DATA' tables.

avbucket() returns the workspace bucket, i.e., the Google bucket associated with a workspace. Bucket content can be visualized under the 'DATA' tab, 'Files' item.

avfiles_ls() returns the paths of files in the workspace bucket. avfiles_backup() copies files from the compute node file system to the workspace bucket. avfiles_restore() copies files from the workspace bucket to the compute node file system. avfiles_rm() removes files or directories from the workspace bucket.

avruntimes() returns a tibble containing information about runtimes (notebooks or RStudio instances, for example) that the current user has access to.

avruntime() returns a tibble with the runtimes associated with a particular Google project and account number; usually there is a single runtime satisfying these criteria, and it is the runtime active in AnVIL.

avdisks() returns a tibble containing information about persistent disks associated with the current user.


avtable_paged(
  table,
  n = Inf,
  page = 1L,
  pageSize = 1000L,
  sortField = "name",
  sortDirection = c("asc", "desc"),
  filterTerms = character(),
  filterOperator = c("and", "or"),
  namespace = avworkspace_namespace(),
  name = avworkspace_name(),
  na = c("", "NA")
)

avtable_import_status(
  job_status,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)

avdata(namespace = avworkspace_namespace(), name = avworkspace_name())

avdata_import(
  .data,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)

avbucket(
  namespace = avworkspace_namespace(),
  name = avworkspace_name(),
  as_path = TRUE
)

avfiles_ls(
  path = "",
  full_names = FALSE,
  recursive = FALSE,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)

avfiles_backup(
  source,
  destination = "",
  recursive = FALSE,
  parallel = TRUE,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)

avfiles_restore(
  source,
  destination = ".",
  recursive = FALSE,
  parallel = TRUE,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)

avfiles_rm(
  source,
  recursive = FALSE,
  parallel = TRUE,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)


avruntimes()

avruntime(project = gcloud_project(), account = gcloud_account())

avdisks()




table: character(1) table name as returned by, e.g., avtables().


n: numeric(1) maximum number of rows to return.


page: integer(1) first page of iteration.


pageSize: integer(1) number of records per page. Generally, larger page sizes are more efficient.


sortField: character(1) field used to sort records when determining page order. Default is the entity field.


sortDirection: character(1) direction to sort entities ("asc"ending or "desc"ending) when paging.


filterTerms: character(1) string literal used to select rows with an exact (substring) match in a column.


filterOperator: character(1) operator to use when multiple terms are given in filterTerms=, either "and" (default) or "or".


namespace: character(1) AnVIL workspace namespace as returned by, e.g., avworkspace_namespace().


name: character(1) AnVIL workspace name as returned by, e.g., avworkspace_name().


na: in avtable() and avtable_paged(), character() of strings to be interpreted as missing values. In avtable_import(), character(1) value to use for representing NA_character_. See Details.


job_status: tibble() of job identifiers, returned by avtable_import() and avtable_import_set().


.data: a tibble or data.frame for import as an AnVIL table.


as_path: logical(1) when TRUE (default), return the bucket with prefix gs:// (for avbucket()) or gs://<bucket-id> (for avfiles_ls()).


path: for avfiles_ls(), the character(1) file or directory path to list. For avfiles_rm(), the character() (perhaps with length greater than 1) of files or directory paths to be removed. The elements of path can contain glob-style patterns, e.g., 'vign*'.


full_names: logical(1) return names relative to path (FALSE, default) or to the root of the workspace bucket?


recursive: logical(1) list files recursively?


source: character() file paths. For avfiles_backup(), source can include directory names when recursive = TRUE.


destination: character(1) a Google bucket (gs://<bucket-id>/...) to which files are written. The default is the workspace bucket.


parallel: logical(1) back up files using parallel transfer? See ?gsutil_cp().


project: character(1) project (billing account) name, as returned by, e.g., gcloud_project() or avworkspace_namespace().


account: character(1) Google account (email address associated with billing account), as returned by gcloud_account().
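
As an illustration of the glob-style patterns accepted in file paths, the following base R sketch shows which names a pattern such as 'vign*' would select. The object names are hypothetical, and utils::glob2rx() is used only as a local stand-in; in the functions documented here, pattern matching is performed by gsutil, whose rules may differ in detail.

```r
# Hypothetical object names, for illustration only; not from a real bucket.
objects <- c("vignette.Rmd", "vign_notes.txt", "README.md", "my_vignette.Rmd")

# utils::glob2rx() converts a glob pattern to an anchored regular expression.
pattern <- utils::glob2rx("vign*")

# Keep only the names that start with "vign", as the glob requests.
matched <- objects[grepl(pattern, objects)]
matched
```

Names that merely contain "vign" somewhere after the first character (here "my_vignette.Rmd") are not selected, because the glob is anchored at the start of the name.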


avfiles_backup() can be used to back up individual files or entire directories, recursively. When recursive = FALSE, files are backed up to the bucket with names approximately paste0(destination, "/", basename(source)). When recursive = TRUE and source is a directory 'path/to/foo/', files are backed up to bucket names that include the directory name, approximately paste0(destination, "/", dir(basename(source), full.names = TRUE)). Naming conventions are described in detail in gsutil_help("cp").

avfiles_restore() behaves in a manner analogous to avfiles_backup(), copying files from the workspace bucket to the compute node file system.
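
The approximate naming rules described above can be previewed locally with base R. This is a minimal sketch, not the package's implementation: the bucket 'gs://my-bucket/backup', the file names, and the variable src (standing in for the source argument) are all hypothetical, and nothing is copied to or from a bucket.

```r
# Hypothetical destination and source paths, for illustration only.
destination <- "gs://my-bucket/backup"
src <- c("path/to/a.txt", "other/b.csv")

# recursive = FALSE: approximately paste0(destination, "/", basename(source))
flat_names <- paste0(destination, "/", basename(src))

# recursive = TRUE: build a small temporary directory 'path/to/foo' to list.
root <- tempfile("backup-sketch-")
src_dir <- file.path(root, "path", "to", "foo")
dir.create(src_dir, recursive = TRUE)
invisible(file.create(file.path(src_dir, c("x.txt", "y.txt"))))

# dir(basename(source)) only resolves from the parent directory of 'foo',
# so switch there temporarily and restore the working directory afterwards.
old <- setwd(dirname(src_dir))
recursive_names <- paste0(
    destination, "/", dir(basename(src_dir), full.names = TRUE)
)
setwd(old)

# Remove the temporary files created for the sketch.
unlink(root, recursive = TRUE)

flat_names
recursive_names
```

In the non-recursive case the directory part of each source path is dropped, whereas in the recursive case the directory name 'foo' is retained in the resulting names.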


avtable_paged() returns a tibble of data corresponding to the AnVIL table 'table' in the specified workspace.

avdata() returns a tibble with five columns: "type" represents the origin of the data, from the 'REFERENCE' or 'OTHER' data menus; "table" is the table name in the 'REFERENCE' menu, or 'workspace' for the table in the 'OTHER' menu; the key used to access the data element; the value label associated with the data element; and the value (e.g., Google bucket) of the element.

avdata_import() returns, invisibly, the subset of the input table used to update the AnVIL tables.

avbucket() returns a character(1) bucket identifier, prefixed with gs:// if as_path = TRUE.

avfiles_ls() returns a character vector of files in the workspace bucket.

avfiles_backup() returns, invisibly, the status code of the gsutil_cp() command used to back up the files.

avfiles_rm() returns, invisibly, a list of the return codes of gsutil_rm() on success.

avruntimes() returns a tibble with columns

  • id: integer() runtime identifier.

  • googleProject: character() billing account.

  • tool: character() e.g., "Jupyter", "RStudio".

  • status: character() e.g., "Stopped", "Running".

  • creator: character() AnVIL account, typically "user@gmail.com".

  • createdDate: character() creation date.

  • destroyedDate: character() destruction date, or NA.

  • dateAccessed: character() date of (first?) access.

  • runtimeName: character().

  • clusterServiceAccount: character() service ('pet') account for this runtime.

  • masterMachineType: character() (it is unclear which 'tool' populates which of the machineType columns).

  • workerMachineType: character().

  • machineType: character().

  • persistentDiskId: integer() identifier of persistent disk (see avdisks()), or NA.
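
A table with these columns can be subset with ordinary data-frame operations. The sketch below uses a hand-built data.frame with a few of the documented columns; the rows, runtime names, and date string are hypothetical placeholders rather than output from avruntimes().

```r
# Hypothetical rows mimicking a subset of the documented columns.
runtimes <- data.frame(
    id = c(1L, 2L, 3L),
    tool = c("Jupyter", "RStudio", "RStudio"),
    status = c("Stopped", "Running", "Stopped"),
    destroyedDate = c(NA, NA, "hypothetical-date"),
    runtimeName = c("rt-a", "rt-b", "rt-c"),
    stringsAsFactors = FALSE
)

# Keep runtimes that are running and have not been destroyed.
is_active <- runtimes$status == "Running" & is.na(runtimes$destroyedDate)
running <- runtimes[is_active, ]
running$runtimeName
```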

avruntime() returns a tibble with the same structure as the return value of avruntimes().

avdisks() returns a tibble with columns

  • id: character() disk identifier.

  • googleProject: character() billing account.

  • status: e.g., "Ready".

  • size: integer() in GB.

  • diskType: character().

  • blockSize: integer().

  • creator: character() AnVIL account, typically "user@gmail.com".

  • createdDate: character() creation date.

  • destroyedDate: character() destruction date, or NA.

  • dateAccessed: character() date of (first?) access.

  • zone: character() e.g., "us-central1-a".

  • name: character().
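
As with runtimes, the disk table lends itself to simple summaries. The following sketch totals the size of disks that have not been destroyed; the data.frame is hand-built with hypothetical values and only a few of the documented columns, so it stands in for, rather than reproduces, the output of avdisks().

```r
# Hypothetical rows mimicking a subset of the documented columns.
disks <- data.frame(
    id = c("d1", "d2", "d3"),
    status = c("Ready", "Ready", "Ready"),
    size = c(50L, 100L, 200L),                   # size in GB
    destroyedDate = c(NA, "hypothetical-date", NA),
    stringsAsFactors = FALSE
)

# Disks without a destruction date are treated as still existing.
active <- disks[is.na(disks$destroyedDate), ]

# Total persistent disk size, in GB, across the remaining disks.
total_gb <- sum(active$size)
total_gb
```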

