'avworkspace_namespace()' and 'avworkspace_name()' are utility functions to retrieve workspace namespace and name from environment variables or interfaces usually available in AnVIL notebooks or RStudio sessions. 'avworkspace()' provides a convenient way to specify workspace namespace and name in a single command.
'avtables()' describes tables available in a workspace. Tables can be visualized under the DATA tab, TABLES item. 'avtable()' returns an AnVIL table. 'avtable_import()' imports a data.frame to an AnVIL table. 'avtable_import_set()' imports set membership (i.e., a subset of an existing table) information to an AnVIL table. 'avtable_delete_values()' removes rows from an AnVIL table.
'avdata()' returns key-value tables representing the information visualized under the DATA tab, 'REFERENCE DATA' and 'OTHER DATA' items.
'avbucket()' returns the workspace bucket, i.e., the google bucket associated with a workspace. Bucket content can be visualized under the 'DATA' tab, 'Files' item.
'avfiles_ls()' returns the paths of files in the workspace bucket. 'avfiles_backup()' copies files from the compute node file system to the workspace bucket. 'avfiles_restore()' copies files from the workspace bucket to the compute node file system. 'avfiles_rm()' removes files or directories from the workspace bucket.
'avruntimes()' returns a tibble containing information about runtimes (notebooks or RStudio instances, for example) that the current user has access to.
'avdisks()' returns a tibble containing information about persistent disks associated with the current user.
avworkspace_namespace(namespace = NULL)
avworkspace_name(name = NULL)
avworkspace(workspace = NULL)
avtables(namespace = avworkspace_namespace(), name = avworkspace_name())
avtable(table, namespace = avworkspace_namespace(), name = avworkspace_name())
avtable_import(
    .data,
    entity = names(.data)[[1]],
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avtable_import_set(
    .data,
    origin,
    set = names(.data)[[1]],
    member = names(.data)[[2]],
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avtable_delete_values(
    table,
    values,
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avdata(namespace = avworkspace_namespace(), name = avworkspace_name())
avbucket(
    namespace = avworkspace_namespace(),
    name = avworkspace_name(),
    as_path = TRUE
)
avfiles_ls(
    path = "",
    full_names = FALSE,
    recursive = FALSE,
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avfiles_backup(
    source,
    destination = "",
    recursive = FALSE,
    parallel = TRUE,
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avfiles_restore(
    source,
    destination = ".",
    recursive = FALSE,
    parallel = TRUE,
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avfiles_rm(
    source,
    recursive = FALSE,
    parallel = TRUE,
    namespace = avworkspace_namespace(),
    name = avworkspace_name()
)
avruntimes()
avdisks()
namespace |
character(1) AnVIL workspace namespace as returned by, e.g., 'avworkspace_namespace()'. |
name |
character(1) AnVIL workspace name as returned by, e.g., 'avworkspace_name()'. |
workspace |
when present, a 'character(1)' providing the concatenated namespace and name, e.g., '"bioconductor-rpci-anvil/Bioconductor-Package-AnVIL"' |
table |
character(1) table name as returned by, e.g., 'avtables()'. |
.data |
A tibble or data.frame for import as an AnVIL table. |
entity |
'character(1)' column name of '.data' to be used as imported table name. When the table comes from R, this is usually a column name such as 'sample'. The data will be imported into AnVIL as a table 'sample', with the 'sample' column included with suffix '_id', e.g., 'sample_id'. A column in '.data' with suffix '_id' can also be used, e.g., 'entity = "sample_id"', creating the table 'sample' with column 'sample_id' in AnVIL. Finally, a value of 'entity' that is not a column in '.data', e.g., 'entity = "unknown"', will cause a new table to be created, named by the value of 'entity' and with entity values 'seq_len(nrow(.data))'. |
origin |
character(1) name of the entity (table) used to create the set, e.g., "sample", "participant", etc. |
set |
'character(1)' column name of '.data' identifying the set(s) to be created. |
member |
'character()' vector of entities from the table identified by 'origin'. Values may repeat if an ID is in more than one set. |
values |
vector of values in the entity (key) column of 'table' to be deleted. A table 'sample' has an associated entity column with suffix '_id', e.g., 'sample_id'. Rows with entity column entries matching 'values' are deleted. |
as_path |
logical(1) when TRUE (default) return bucket with prefix 'gs://' (for 'avbucket()') or 'gs://<bucket-id>' (for 'avfiles_ls()'). |
path |
For 'avfiles_ls()', the character(1) file or directory path to list. For 'avfiles_rm()', the character() (perhaps with length greater than 1) of file or directory paths to be removed. The elements of 'path' can contain glob-style patterns, e.g., 'vign*'. |
full_names |
logical(1) return names relative to 'path' ('FALSE', default) or root of the workspace bucket? |
recursive |
logical(1) list files recursively? |
source |
character() file paths. For 'avfiles_backup()', 'source' can include directory names when 'recursive = TRUE'. |
destination |
character(1) a google bucket ('gs://<bucket-id>/...') to write files. The default is the workspace bucket. |
parallel |
logical(1) backup files using parallel transfer? See '?gsutil_cp()'. |
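The naming rule described for the 'entity' argument can be sketched in base R. This is only an illustration of the documented behaviour, not AnVIL's implementation: 'entity_naming()' and the example data are hypothetical, and the assumption that the not-a-column case also gets an '_id' key column is inferred from the description of 'values' rather than stated for that case.

```r
## Hypothetical helper (not part of AnVIL): predict the table name and
## key column implied by an 'entity' argument, following the rule above.
entity_naming <- function(.data, entity = names(.data)[[1]]) {
    table <- sub("_id$", "", entity)     # 'sample_id' -> 'sample'
    id_column <- paste0(table, "_id")    # key column carries the '_id' suffix
    if (entity %in% names(.data)) {
        ids <- .data[[entity]]           # existing column supplies the keys
    } else {
        ids <- seq_len(nrow(.data))      # otherwise keys are 1, 2, ...
    }
    list(table = table, id_column = id_column, ids = ids)
}

## Illustrative data only
df <- data.frame(
    sample = c("s1", "s2"),
    age = c(31, 47),
    stringsAsFactors = FALSE
)

entity_naming(df)                       # table 'sample', key 'sample_id'
entity_naming(df, entity = "unknown")   # table 'unknown', keys 1, 2
```

Under these assumptions, 'entity = "sample"' and 'entity = "sample_id"' lead to the same table name and key column.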
'avworkspace_namespace()' is the billing account. If the 'namespace=' argument is not provided, try 'gcloud_project()', and if that fails try 'Sys.getenv("WORKSPACE_NAMESPACE")'.
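The fallback order for the namespace can be sketched in base R. This is a minimal illustration, not the package's code: 'resolve_namespace()' is a hypothetical name, 'project_lookup' stands in for 'gcloud_project()' so the example runs without gcloud, and the environment variable value is made up.

```r
## Hypothetical sketch of the fallback order described above.
resolve_namespace <- function(namespace = NULL,
                              project_lookup = function() stop("gcloud unavailable")) {
    if (!is.null(namespace))
        return(namespace)                           # 1. explicit argument
    value <- tryCatch(project_lookup(), error = function(e) "")
    if (is.character(value) && length(value) == 1L &&
        !is.na(value) && nzchar(value))
        return(value)                               # 2. project lookup
    Sys.getenv("WORKSPACE_NAMESPACE")               # 3. environment variable
}

Sys.setenv(WORKSPACE_NAMESPACE = "my-billing-project")  # illustrative value

resolve_namespace("explicit-namespace")   # the explicit argument wins
resolve_namespace()                       # lookup fails, environment variable used
resolve_namespace(project_lookup = function() "gcloud-project")
```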
'avworkspace_name()' is the name of the workspace as it appears in https://app.terra.bio/#workspaces. If not provided, 'avworkspace_name()' tries to use 'Sys.getenv("WORKSPACE_NAME")'.
Values are cached across sessions, so explicitly providing 'avworkspace_*()' is required at most once per session. Revert to system settings with arguments 'NA'.
'avtable_import_set()' creates new rows in a table '<origin>_set'. One row will be created for each distinct value in the column identified by 'set'. Each row entry has a corresponding column '<origin>' linking to one or more rows in the '<origin>' table, as given in the 'member' column. The operation is somewhat like 'split(member, set)'.
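The 'split(member, set)' analogy can be seen with a small base R example. The data and column names here are illustrative only; nothing is imported into AnVIL.

```r
## Illustrative membership table: one row per (set, member) pair.
## Member 's1' appears in two sets, which is allowed.
membership <- data.frame(
    set = c("A", "B", "A", "B", "C"),
    member = c("s1", "s2", "s3", "s4", "s1"),
    stringsAsFactors = FALSE
)

## One list element per distinct set, holding that set's members;
## this mirrors one row per set in the '<origin>_set' table.
sets <- split(membership$member, membership$set)
sets
```

Each element of 'sets' corresponds to a row that would be created in the set table, with the element's values linking to rows of the origin table.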
'avfiles_backup()' can be used to back up individual files or entire directories, recursively. When 'recursive = FALSE', files are backed up to the bucket with names approximately 'paste0(destination, "/", basename(source))'. When 'recursive = TRUE' and 'source' is a directory 'path/to/foo/', files are backed up to bucket names that include the directory name, approximately 'paste0(destination, "/", dir(basename(source), full.names = TRUE))'. Naming conventions are described in detail in 'gsutil_help("cp")'.
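The two naming expressions can be evaluated locally to preview the approximate bucket names. This sketch copies nothing; the bucket 'gs://my-bucket-id' and the file names are hypothetical, and a temporary directory stands in for 'path/to/foo/'.

```r
destination <- "gs://my-bucket-id"   # hypothetical bucket

## recursive = FALSE: only the file name is kept
source_files <- c("path/to/foo/a.txt", "other/b.csv")
flat_names <- paste0(destination, "/", basename(source_files))

## recursive = TRUE: the directory name is kept in the bucket path.
## Create a throw-away directory 'foo' containing two files.
parent <- tempfile("backup-demo-")
dir.create(file.path(parent, "foo"), recursive = TRUE)
invisible(file.create(file.path(parent, "foo", c("a.txt", "b.txt"))))

old <- setwd(parent)                 # dir() below is relative to 'parent'
nested_names <- paste0(
    destination, "/",
    dir(basename(file.path(parent, "foo")), full.names = TRUE)
)
setwd(old)                           # restore the working directory

flat_names
nested_names
```

The actual object names are decided by gsutil, so treat these as an approximation in the same sense as the expressions above.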
'avfiles_restore()' behaves in a manner analogous to 'avfiles_backup()', copying files from the workspace bucket to the compute node file system.
'avworkspace_namespace()' and 'avworkspace_name()' return 'character(1)' identifiers.
'avworkspace()' returns the character(1) concatenated namespace and name.
'avtables()': A tibble with columns identifying the table, the number of records, and the column names.
'avtable()': a tibble of data corresponding to the AnVIL table 'table' in the specified workspace.
'avtable_import()' returns a 'character(1)' name of the imported AnVIL tibble.
'avtable_import_set()' returns a 'character(1)' name of the imported AnVIL tibble.
'avtable_delete_values()' returns a 'tibble' representing deleted entities, invisibly.
'avdata()' returns a tibble with five columns: '"type"' represents the origin of the data from the 'REFERENCE' or 'OTHER' data menus. '"table"' is the table name in the 'REFERENCE' menu, or 'workspace' for the table in the 'OTHER' menu. The remaining columns are the key used to access the data element, the value label associated with the data element, and the value (e.g., google bucket) of the element.
'avbucket()' returns a 'character(1)' bucket identifier, prefixed with 'gs://' if 'as_path = TRUE'.
'avfiles_ls()' returns a character vector of files in the workspace bucket.
'avfiles_backup()' returns, invisibly, the status code of the 'gsutil_cp()' command used to back up the files.
'avfiles_rm()' returns, on success, a list of the return codes of 'gsutil_rm()', invisibly.
'avruntimes()' returns a tibble with columns
id: integer() runtime identifier.
googleProject: character() billing account.
tool: character() e.g., "Jupyter", "RStudio".
status: character() e.g., "Stopped", "Running".
creator: character() AnVIL account, typically "user@gmail.com".
createdDate: character() creation date.
destroyedDate: character() destruction date, or NA.
dateAccessed: character() date of (first?) access.
runtimeName: character().
clusterServiceAccount: character() service ('pet') account for this runtime.
masterMachineType: character(). It is unclear which 'tool' populates which of the machineType columns.
workerMachineType: character().
machineType: character().
persistentDiskId: integer() identifier of persistent disk (see 'avdisks()'), or NA.
'avdisks()' returns a tibble with columns
id: character() disk identifier.
googleProject: character() billing account.
status: e.g., "Ready".
size: integer() in GB.
diskType: character().
blockSize: integer().
creator: character() AnVIL account, typically "user@gmail.com".
createdDate: character() creation date.
destroyedDate: character() destruction date, or NA.
dateAccessed: character() date of (first?) access.
zone: character() e.g., "us-central1-a".
name: character().
avworkspace_namespace()
avworkspace_name()
avworkspace()

## Not run:
library(dplyr)  # provides %>% and mutate()

## editable copy of '1000G-high-coverage-2019' workspace
avworkspace("bioconductor-rpci-anvil/1000G-high-coverage-2019")
sample <-
    avtable("sample") %>%                               # existing table
    mutate(set = sample(head(LETTERS), nrow(.), TRUE))  # arbitrary groups
sample %>%                                    # new 'participant_set' table
    avtable_import_set("participant", "set", "participant")
sample %>%                                    # new 'sample_set' table
    avtable_import_set("sample", "set", "name")
## End(Not run)

if (gcloud_exists() && nzchar(avworkspace_name()))
    ## from within AnVIL
    avdata()

if (gcloud_exists() && nzchar(avworkspace_name()))
    ## From within AnVIL...
    bucket <- avbucket()                        # discover bucket

## Not run:
path <- file.path(bucket, "mtcars.tab")
gsutil_ls(dirname(path))                    # no 'mtcars.tab'...
write.table(mtcars, gsutil_pipe(path, "w")) # write to bucket
gsutil_stat(path)                           # yep, there!
read.table(gsutil_pipe(path, "r"))          # read from bucket
## End(Not run)

if (gcloud_exists() && nzchar(avworkspace_name()))
    avfiles_ls()

## Not run:
## backup all files in the current directory
## default buckets are gs://<bucket-id>/<file-names>
avfiles_backup(dir())
## backup working directory, recursively
## default buckets are gs://<bucket-id>/<basename(getwd())>/...
avfiles_backup(getwd(), recursive = TRUE)
## End(Not run)

if (gcloud_exists())
    ## from within AnVIL
    avruntimes()

if (gcloud_exists())
    ## from within AnVIL
    avdisks()