```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```
Seven Bridges platforms provide a few different methods for data import. In this chapter we will explain how you can use the `sevenbridges2` API package to upload your files to the Platform.

Although it would be more intuitive to have these operations available on the `File` object, they are stored directly on the authentication object (`Auth`), because they form a separate group of endpoints.
You can upload files from your local computer to the Platform using the `upload()` method on your `Auth` object. The method currently allows you to upload only a single file at a time.

To upload a file, provide its full path on your local computer as the `path` parameter.
To specify the upload destination for your file, use either the `project` or the `parent` parameter. These two parameters should not be used together:

- `project` - a `Project` object or project ID.
- `parent` - a `File` object (of type `Folder`) or its ID.

By calling the `upload()` method you are creating an upload job that by default starts to run immediately. If you don't want to start the job immediately, set the `init` parameter to `TRUE` in order to only initialize the upload object. This upload job is wrapped into an object of the `Upload` class, where you can see its details and call other actions on it.
Let's initialize an upload job that will upload a file into a project:
```r
# Authenticate
a <- Auth$new(platform = "aws-us", token = "<your-token>")

# Get the desired project to upload to
destination_project <- a$projects$get(project = "<project_id>")

# Create upload job and set destination project
upload_job <- a$upload(
  path = "/path/to/your/file.txt",
  project = destination_project,
  overwrite = TRUE,
  init = TRUE
)
```
If you would like to upload your file into a folder, you need to set the `parent` parameter:
```r
# Get destination folder object
destination_folder <- a$files$get(id = "<folder_id>")

up <- a$upload(
  path = "/path/to/your/file.txt",
  parent = destination_folder,
  overwrite = TRUE,
  init = TRUE
)
```
Since we have initialized the upload job, let's see which actions we can run. First, let's print the `Upload` object to see what the API returned in the response.
```r
up$print()
```
```
── Upload ─────────────────────────────────────────────────────────────
• initialized: TRUE
• part_length: 1
• part_size: 33554432
• file_size: 232
• overwrite: FALSE
• filename: file.txt
• project: <username_or_division>/api-testing
• path: /path/to/your/file.txt
• upload_id: 4OvRx8Z9vghNoAUqsgYtNuM2IsiIM8kghhjgi7igu79HX9QKZpDEh5TZDrmhPxF
```
In the previous example we can see that the API returned the upload ID and some size information. First, there is the `file_size` in bytes (232), which is the actual size of the file. Under the hood, the upload splits files into parts; the parts are uploaded one by one or in parallel and then merged again at the destination. Each part can be at most 5 GB, while the recommended default `part_size` is 32 MB (33554432 bytes in our example). Lastly, the number of parts, the `part_length` field, is also an important measure: the maximum number of parts is 10,000. Since users can control the part size through the `part_size` parameter of the `upload()` function, they should be careful not to set a size that is too small for very large files, so that the total number of parts doesn't exceed the limit of 10,000.
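To illustrate the arithmetic, here is a quick back-of-the-envelope check in plain base R (the file size is hypothetical):

```r
# Hypothetical example: a 500 GB file with the default 32 MB part size
file_size <- 500 * 1024^3 # file size in bytes
part_size <- 32 * 1024^2  # default part size in bytes

# Number of parts this combination would produce
ceiling(file_size / part_size) # 16000 - exceeds the 10,000-part limit

# Smallest part_size (in bytes) that keeps the file within 10,000 parts
ceiling(file_size / 10000)
```

In this case you would pass a larger `part_size` to `upload()` (e.g. 64 MB, giving 8,000 parts) to stay under the limit.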
Call the `start()` method on the upload job object to start the upload process.
```r
# Start upload
up$start()
```
If you want to skip the step where you need to call the `start()` method to start the actual upload process, just set the `init` parameter back to `FALSE` when creating the upload job and the upload process will start right away.
```r
# Create upload job and start it immediately
up <- a$upload(
  path = "/path/to/your/file.txt",
  project = destination_project,
  overwrite = TRUE,
  init = FALSE
)
```
In order to track the progress of the job, you can call the `info()` method on the upload object.
```r
# Get upload progress info
up$info()
```
Apart from basic information, the result also includes the number of parts uploaded so far.
Going back to the authentication object, there are two more operations for manipulating uploads. One is the `list_ongoing_uploads()` method, which lists all ongoing upload processes.
```r
# List ongoing uploads
a$list_ongoing_uploads()
```
The other one is `abort_upload()`, which you can use to abort any upload process. To do so, provide the ID of the process via the `upload_id` parameter.
```r
# Abort upload
a$abort_upload(upload_id = "<id_of_the_upload_process>")
```
Note that in practice, if you start a big upload job, your R session will be blocked until the process finishes. Work is in progress to avoid blocking the main session while the upload is running. For now, you can create another R session yourself and track the progress of the upload job there.
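For example, here is a minimal sketch of that workaround, assuming the `callr` package is installed. The background process must authenticate on its own, since R objects can't be shared across sessions:

```r
# A sketch: run the upload in a background R process via callr,
# so the main session stays responsive
bg_upload <- callr::r_bg(
  function(token, path, project_id) {
    library(sevenbridges2)
    a <- Auth$new(platform = "aws-us", token = token)
    a$upload(
      path = path,
      project = a$projects$get(project = project_id),
      init = FALSE # start the upload immediately
    )
  },
  args = list(
    token = "<your-token>",
    path = "/path/to/your/file.txt",
    project_id = "<project_id>"
  )
)

# Poll from the main session to see whether the background upload finished
bg_upload$is_alive()
```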
Cloud storage providers come with their own interfaces, features, and terminology. At a certain level, though, they all view resources as data objects organized in repositories. Authentication and operations are commonly defined on those objects and repositories, and while each cloud provider might call these things different names and apply different parameters to them, their basic behavior is the same.
Seven Bridges environments mediate access to these repositories using volumes. A volume is associated with a particular cloud storage repository that you have enabled Seven Bridges to read from (and, optionally, to write to). Currently, volumes may be created using two types of cloud storage repositories: Amazon Web Services' (AWS) S3 buckets and Google Cloud Storage (GCS) buckets.
A volume enables you to treat the cloud repository associated with it as external storage. You can 'import' files from the volume to your Seven Bridges environment to use them as inputs for computation. Similarly, you can write files from the Seven Bridges environment to your cloud storage by 'exporting' them to your volume.
Learn more about volumes on the Seven Bridges Platform, CGC, BDC and CAVATICA.
All volume-related operations for querying volumes, fetching a single volume, and creating volumes are grouped under the `volumes` path (the `Volumes` resource class) on the authentication object.

When operating with a single volume, it is represented as an object of the `Volume` class, which stores all volume information returned from the API along with additional methods you can call directly on the volume, such as volume update, deactivation, listing content, volume member management, etc.
Note that all operations with volumes require the `advance_access` parameter to be set to `TRUE`. In most of the volume operations it is pre-set to `TRUE` by default.
You can list all volumes you've registered by calling the `volumes$query()` method on the authentication object. The method doesn't have any additional query parameters that would allow you to search for volumes by specific criteria, except the `limit` and `offset` parameters that control the number of results returned.
```r
# Query volumes
a$volumes$query()
```
The result returned is a `Collection` object with pagination ability.
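For instance, a short sketch of paging through that collection (the object name is illustrative; `advance_access` is passed explicitly here, as in the import/export examples later, although it is pre-set to `TRUE` for most volume operations):

```r
# Query volumes ten at a time and page through the results
volumes_collection <- a$volumes$query(limit = 10)

# Fetch the next page of results
volumes_collection$next_page(advance_access = TRUE)

# Or fetch all remaining results at once
volumes_collection$all(advance_access = TRUE)
```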
In order to retrieve information about a single volume of interest, use the `volumes$get()` method with the volume ID as a parameter. A volume ID usually has the `<division_name>/<volume_name>` form for Enterprise users, while for public program users it can have the `<volume_owner>/<volume_name>` form.
```r
# Get volume
a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
```
For creating volumes, several functions are exposed for different cloud providers and authentication types:

- `create_s3_using_iam_user`: creates an S3 volume using the IAM User authentication type
- `create_s3_using_iam_role`: creates an S3 volume using the IAM Role authentication type
- `create_google_using_iam_user`: creates a GC volume using the IAM User authentication type
- `create_google_using_iam_role`: creates a GC volume using the IAM Role authentication type
- `create_azure`: creates an Azure volume (only RO privileges allowed)
- `create_ali_oss`: creates an AliCloud volume (only RO privileges allowed)

For each of these functions it is possible to provide the parameters via a path (`from_path`) to a JSON file in which all required fields are listed. Examples of use are shown below:
```r
# Create AWS volume using IAM User authentication type
aws_iam_user_volume <- a$volumes$create_s3_using_iam_user(
  name = "my_new_aws_user_volume",
  bucket = "<bucket-name>",
  description = "AWS IAM User volume",
  access_key_id = "<access-key>",
  secret_access_key = "<secret-access-key>"
)
aws_iam_user_volume_from_path <- a$volumes$create_s3_using_iam_user(
  from_path = "path/to/my/json/file.json"
)

# Create AWS volume using IAM Role authentication type
aws_iam_role_volume <- a$volumes$create_s3_using_iam_role(
  name = "my_new_aws_role_volume",
  bucket = "<bucket-name>",
  description = "AWS IAM Role volume",
  role_arn = "<role-arn-key>",
  external_id = "<external-id>"
)
aws_iam_role_volume_from_path <- a$volumes$create_s3_using_iam_role(
  from_path = "path/to/my/json/file.json"
)

# Create Google Cloud volume using IAM User authentication type
gc_iam_user_volume <- a$volumes$create_google_using_iam_user(
  name = "my_new_gc_user_volume",
  access_mode = "RW",
  bucket = "<bucket-name>",
  description = "GC IAM User volume",
  client_email = "<client_email>",
  private_key = "<private_key-string>"
)
gc_iam_user_volume_from_path <- a$volumes$create_google_using_iam_user(
  from_path = "path/to/my/json/file.json"
)

# Create Google Cloud volume using IAM Role authentication type
# by passing configuration parameter as named list
gc_iam_role_volume <- a$volumes$create_google_using_iam_role(
  name = "my_new_gc_role_volume",
  access_mode = "RO",
  bucket = "<bucket-name>",
  description = "GC IAM Role volume",
  configuration = list(
    type = "<type-name>",
    audience = "<audience-link>",
    subject_token_type = "<subject_token_type>",
    service_account_impersonation_url = "<service_account_impersonation_url>",
    token_url = "<token_url>",
    credential_source = list(
      environment_id = "<environment_id>",
      region_url = "<region_url>",
      url = "<url>",
      regional_cred_verification_url = "<regional_cred_verification_url>"
    )
  )
)

# Create Google Cloud volume using IAM Role authentication type
# by passing configuration parameter as string path to configuration file
gc_iam_role_volume_config_file <- a$volumes$create_google_using_iam_role(
  name = "my_new_gc_role_volume_cnf_file",
  access_mode = "RO",
  bucket = "<bucket-name>",
  description = "GC IAM Role volume - using config file",
  configuration = "path/to/config/file.json"
)

# Create Google Cloud volume using IAM Role authentication type
# using from_path parameter
gc_iam_role_volume_from_path <- a$volumes$create_google_using_iam_role(
  from_path = "path/to/full/config/file.json"
)

# Create Azure volume
azure_volume <- a$volumes$create_azure(
  name = "my_new_azure_volume",
  description = "Azure volume",
  endpoint = "<endpoint>",
  container = "<bucket-name>",
  storage_account = "<storage_account-name>",
  tenant_id = "<tenant_id>",
  client_id = "<client_id>",
  client_secret = "<client_secret>",
  resource_id = "<resource_id>"
)
azure_volume_from_path <- a$volumes$create_azure(
  from_path = "path/to/my/json/file.json"
)

# Create Ali Cloud volume
ali_volume <- a$volumes$create_ali_oss(
  name = "my_new_ali_volume",
  description = "Ali volume",
  endpoint = "<endpoint>",
  bucket = "<bucket-name>",
  access_key_id = "<access_key_id>",
  secret_access_key = "<secret_access_key>"
)
ali_volume_from_path <- a$volumes$create_ali_oss(
  from_path = "path/to/my/json/file.json"
)
```
When you've created a new volume, notice that it is represented as an object of the `Volume` class. To preview all volume information, use the `print()` method:
```r
# Print volume info
print(aws_iam_user_volume)
```
Within this volume object you have the following operations available to execute:

- `update`: update volume information
- `list_contents`: list volume content
- `get_file`: get single volume file info
- `deactivate`: deactivate volume
- `reactivate`: reactivate previously deactivated volume
- `list_members`: list all volume members
- `add_member`: add new volume member
- `remove_member`: remove volume member
- `get_member`: get a volume member's information
- `modify_member_permissions`: modify member permissions on the volume
- `delete`: delete previously deactivated volume
- `reload`: reload volume object to sync information
- `list_imports`: list all imports from the specified volume
- `list_exports`: list all exports to the specified volume

You can update a volume's `description`, `access_mode` and `service` information. Please consult our API documentation on how to use the `service` parameter.
```r
# If the volume is created with RO access mode and RO credential parameters,
# and now we want to change it to RW, we should also set proper credential
# parameters that are connected to the RW user on the bucket.
# If it's created with RW credentials, but access mode is set to RO, then no
# change is needed in the credentials parameters.
aws_iam_user_volume$update(
  description = "Updated to RW",
  access_mode = "RW",
  service = list(
    credentials = list(
      access_key_id = "<access_key_id_for_rw>",
      secret_access_key = "<secret_access_key_for_rw>"
    )
  )
)
```
To keep your local `Volume` object up to date with the volume on the Platform, you can always call the `reload()` function:
```r
# Reload volume object
aws_iam_user_volume$reload()
```
This operation lists all volume files in the root directory of the bucket, unless the `prefix` parameter is specified, in which case it lists the content of that directory on the bucket.
The output is a `VolumeContentCollection` object that contains two fields:

- `items`, storing a list of `VolumeFile` objects (files on the volume), and
- `prefixes`, storing a list of `VolumePrefix` objects (folders on the volume).
You can also specify the `limit` parameter to control the number of results returned.
As with `Collection` objects, pagination functions are available here to return either the next page of results or all results; however, backward pagination is not available for volume contents. You can also navigate through pages of results by using the `continuation_token` parameter, or use the `link` parameter to fetch the next chunk of results. If set, the `link` parameter overrides all other parameters, since it already contains the `limit` and `continuation_token` info.
```r
# List all files in root bucket directory
content_collection <- aws_iam_user_volume$list_contents(limit = 20)

# Print collection
content_collection

# List all files from a specific directory on the bucket
folder_files_collection <- aws_iam_user_volume$list_contents(
  prefix = "<directory_name>"
)

# Get the next group of results by setting the continuation token
content_collection <- aws_iam_user_volume$list_contents(
  limit = 20,
  continuation_token = "<continuation_token>"
)

# Preview volume files
content_collection$items

# Preview volume prefixes/folders
content_collection$prefixes

# Preview links
aws_iam_user_volume$links

# Get the next group of results by setting the link parameter
aws_iam_user_volume$list_contents(link = "<link_to_next_results>")

# Or use VolumeContentCollection object's next_page() method for this:
content_collection$next_page()

# You can also fetch all results with the all() method
content_collection$all()
```
Volume files and prefixes are also treated as objects, and some operations can be called directly on them.
This operation returns information about a single volume file. The input parameter can be the file's ID, which is represented as its location on the bucket (`location`), or a link to that file resource. The link is the `href` field of the desired file, received in the response when listing volume contents with `list_contents()`. Empty arguments are not allowed, and the two parameters cannot be set together.
```r
# Get single volume file info - by setting file location
vol_file1 <- aws_iam_user_volume$get_file(
  location = "<file_location_on_bucket>"
)

# Get single volume file info - by setting link
vol_file1 <- aws_iam_user_volume$get_file(link = "full/request/link/to/file")
```
To keep your local `VolumeFile` object up to date with the volume file on the Platform, you can always call the `reload()` function:
```r
vol_file1$reload()
```
Unfortunately, there is no separate operation to fetch only the prefixes on a volume; therefore, you can get them only by using the `list_contents()` operation and looking at the `prefixes` field of the returned `VolumeContentCollection` object.
```r
# List content
volume_content <- aws_iam_user_volume$list_contents()

# Extract prefixes
volume_prefixes <- volume_content$prefixes

# Select one of the volume folders to list its content
volume_folder <- volume_prefixes[[1]]

# Print volume prefix information
volume_folder$print()
```
You can also list the content of a prefix/folder on the volume by calling `list_contents()` directly on the `VolumePrefix` object.
```r
# Select one of the volume folders to list its content
volume_folder <- volume_prefixes[[1]]

# List content
volume_folder_content <- volume_folder$list_contents()
```
In order to fetch the members of a volume, or a specific member by username, use the `list_members()` and `get_member()` operations:
```r
# List volume members
aws_iam_user_volume$list_members() # limit = 2

# Get single member
aws_iam_user_volume$get_member(user = "<member-username>")
```
Volume admins can remove volume members by providing a username or an object of the `Member` class to the `remove_member()` function:
```r
# Remove member
aws_iam_user_volume$remove_member("<member-username>")

# Remove member using the Member object
members <- aws_iam_user_volume$list_members()
aws_iam_user_volume$remove_member(members$items[[3]])
```
The function for adding new members to the volume accepts a `Member` object (for example, one used in a project) or a username.
```r
# Add member via username
aws_iam_user_volume$add_member(
  user = "<member-username>",
  permissions = list(
    read = TRUE,
    copy = TRUE,
    write = FALSE,
    admin = FALSE
  )
)

# Add member via Member object
aws_iam_user_volume$add_member(
  user = Member$new(
    username = "<member-username>",
    id = "<member-username>"
  ),
  permissions = list(
    read = TRUE,
    copy = TRUE,
    write = FALSE,
    admin = FALSE
  )
)
```
Users can modify a specific member's permissions on the volume by providing the privileges they want to change:
```r
# Modify member permissions
aws_iam_user_volume$modify_member_permissions(
  user = "<member-username>",
  permissions = list(write = TRUE)
)
```
Once deactivated, you cannot import from, export to, or browse within a volume. As such, the content of the files imported from this volume will no longer be accessible on the Platform. However, you can still update the volume and manage members.

Note that you cannot deactivate a volume that has running imports or exports unless you force the operation using the query parameter `force = TRUE`.
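For instance, a hedged sketch of forcing deactivation, assuming `deactivate()` accepts and forwards the `force` query parameter:

```r
# Force deactivation even if imports/exports are still running
# (assumption: deactivate() forwards the force query parameter)
aws_iam_user_volume$deactivate(force = TRUE)
```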
Note that to delete a volume, you must first deactivate it and delete all files that have been imported from the volume to the Platform.
To reactivate the volume, just use the `reactivate()` function.
```r
# Deactivate volume
aws_iam_user_volume$deactivate()

# Reactivate volume
aws_iam_user_volume$reactivate()
```
To be able to delete a volume, you first need to deactivate it and then delete all files on the Platform that were previously imported from the volume.
```r
# Deactivate volume
aws_iam_user_volume$deactivate()

# Delete volume
aws_iam_user_volume$delete()
```
Creating and connecting volumes to the Platform allows you to import your files/folders from a cloud bucket to the Platform. Import operations are related to volumes, but in the API they are separated under the `/imports` endpoints, so in our library they are also grouped under the `imports` path on the authentication object (the `Imports` resource class).
A single import job is represented as an `Import` class object containing information about which file/folder has been or is being imported, from which volume, to which project/folder on the Platform, the import start and finish times, the status of the job, logs, etc.
To preview and query all import jobs you've created, use the `query()` function on the `Auth$imports` path:
```r
# List imports
all_imports <- a$imports$query()

# Limit results to 5
imp_limit5 <- a$imports$query(limit = 5)

# Load next page of 5 results
imp_limit5$next_page(advance_access = TRUE)

# Load all results at once until last page
imp_limit5$all(advance_access = TRUE)
```
It is possible to use some query parameters as filtering criteria, like `volume`, `project`, `state`, etc.:
```r
# List imports with state being RUNNING or FAILED
imp_states <- a$imports$query(state = c("RUNNING", "FAILED"))

# List imports to the specific project
imp_project <- a$imports$query(project = "<project_id>")
```
Listing imports is also available on `Project` and `Volume` objects, where the resulting imports are related to the specific project or volume they're called from.
```r
# Get the volume from which you want to list all imports
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")
vol1$list_imports()

# Get the project object for which you want to list imports
test_proj <- a$projects$get("<project_id>")
test_proj$list_imports()
```
Similar to other resource classes, the `get()` method will return a single import job object when provided with a job ID.
```r
# Get single import
imp_obj <- a$imports$get(id = "<import_job_id>")
```
Users are able to fetch details for multiple import jobs by calling one bulk action, the `bulk_get()` method. The accepted input can be a list of import job IDs or a list of import job objects (of class `Import`). The result will be a `Collection` object containing a list of (updated) import jobs.
```r
# Get details of multiple import jobs
import_jobs <- a$imports$bulk_get(
  imports = list("<import_job_id-1>", "<import_job_id-2>")
)
```
In order to import volume files into a project, users can use the `submit_import()` method from the `Auth$imports` path, or call `import()` directly on the selected `VolumeFile` object (the file they want to import), where this function is also available.
```r
# First, get the volume you want to import files from
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# Then, get the project object/id where you want to import files
test_proj <- a$projects$get("<project_id>")

# List all volume files on the volume
vol1_content <- vol1$list_contents()

# Select one of the volume files
volume_file_import <- vol1_content$items[[3]]

# Perform a file import
imp_job1 <- a$imports$submit_import(
  source_location = volume_file_import$location,
  destination_project = test_proj,
  autorename = TRUE
)

# Alternatively you can also call import() directly on the VolumeFile object
imp_job1 <- volume_file_import$import(
  destination_project = test_proj,
  autorename = TRUE
)
```
Preview import job details with the `print()` method:
```r
# Print Import object
print(imp_job1)
```
You can also import folders from the volume into the project, with the option to preserve or not to preserve folder structure:
```r
# Select one of the volume folders to import
volume_folder_import <- vol1_content$prefixes[[1]]

# Perform a folder import
imp_job2 <- a$imports$submit_import(
  source_location = volume_folder_import$prefix,
  destination_project = test_proj,
  overwrite = TRUE,
  preserve_folder_structure = TRUE
)

# Alternatively you can also call import() directly on the VolumePrefix object
imp_job2 <- volume_folder_import$import(
  destination_project = test_proj,
  overwrite = TRUE,
  preserve_folder_structure = TRUE
)

# Print Import object
print(imp_job2)
```
In order to refresh the import job object and get up-to-date info about its state, you can always call the `reload()` function:
```r
# Reload import object
imp_job1$reload()
```
Users are able to perform a bulk action on multiple volume files or folders and import them into a project using a single call of the `bulk_submit_import()` method from the `Auth$imports` path.

The required input is a nested list with an element for each file/folder you want to import, containing the fields: `source_volume`, `source_location`, `destination_project`, `destination_parent`, `name`, `autorename` and `preserve_folder_structure`.
```r
# First, get the volume you want to import files from
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# Then, get the project object or the ID of the project into which you want
# to import files
test_proj <- a$projects$get("<project_id>")

# List all volume files
vol1_content <- vol1$list_contents()

# Preview the content and select one VolumeFile object and one VolumePrefix
# object (folder) for the purpose of this example
volume_file_import <- vol1_content$items[[1]]
volume_file_import

volume_folder_import <- vol1_content$prefixes[[1]]
volume_folder_import

# Construct the inputs list by filling the necessary information for each
# file/folder to import
to_import <- list(
  list(
    source_volume = "rfranklin/my-volume",
    source_location = "chimeras.html.gz",
    destination_project = "rfranklin/my-project"
  ),
  list(
    source_volume = vol1,
    source_location = "my-folder/",
    destination_project = test_proj,
    autorename = TRUE,
    preserve_folder_structure = TRUE
  ),
  list(
    source_volume = "rfranklin/my-volume",
    source_location = "my-volume-folder/",
    destination_parent = "parent-id",
    name = "new-folder-name",
    autorename = TRUE,
    preserve_folder_structure = FALSE
  )
)

bulk_import_jobs <- a$imports$bulk_submit_import(items = to_import)

# Preview the results
bulk_import_jobs

# Get updated status by fetching details with bulk_get() and by passing the
# list of import jobs created in the previous step
a$imports$bulk_get(imports = bulk_import_jobs$items)
```
As you may see from the example above, users are able to import folders from the volume into the project or a project directory, with the option to preserve or not to preserve folder structure.
Moreover, when constructing the inputs list you are able to pass objects of the classes `Volume`, `Project` or `File` (with `type = 'folder'`) for the `source_volume`, `destination_project` and `destination_parent` fields, besides using their string IDs.
In the example above, the list of items for the bulk import job was created manually. Alternatively, you can use the `prepare_items_for_bulk_import()` utility function to generate the items list.
This function allows you to prepare the list of bulk import items based on the provided `VolumeFile` or `VolumePrefix` objects, filling the following fields for each item: `source_volume`, `source_location`, `destination_project` or `destination_parent`, `autorename` and `preserve_folder_structure`.
Note that the same `destination_project`/`destination_parent` and `autorename` values will be applied uniformly across all items in the resulting list. The `preserve_folder_structure` parameter, if provided, applies exclusively to `VolumePrefix` items.
```r
# First, get the volume you want to import files from
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# Then, get the project object or the ID of the project into which you want
# to import files
test_proj <- a$projects$get("<project_id>")

# List all volume files
vol1_content <- vol1$list_contents()

# Select two VolumeFile objects
volume_file_1_import <- vol1_content$items[[1]]
volume_file_2_import <- vol1_content$items[[2]]
volume_files_to_import <- list(volume_file_1_import, volume_file_2_import)

# Construct the inputs list using the prepare_items_for_bulk_import() utility
# function
to_import <- prepare_items_for_bulk_import(
  volume_items = volume_files_to_import,
  destination_project = test_proj
)

bulk_import_jobs <- a$imports$bulk_submit_import(items = to_import)

# Preview the results
bulk_import_jobs

# Get updated status by fetching details with bulk_get() and by passing the
# list of import jobs created in the previous step
a$imports$bulk_get(imports = bulk_import_jobs$items)
```
Keep in mind that `prepare_items_for_bulk_import()` is designed solely to assist in constructing the list of items for submitting a bulk import job. It operates under certain constraints; refer to the function's documentation for further details. After obtaining the function's output, you can manually adjust individual items as needed, as shown below.
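A sketch of such a manual adjustment (the field name is taken from the bulk import input list described above):

```r
# Tweak a single prepared item before submitting, e.g. override
# autorename for the first file only
to_import[[1]][["autorename"]] <- FALSE
a$imports$bulk_submit_import(items = to_import)
```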
Exports are the actions of exporting your files from the Platform into a cloud bucket represented as a volume. Export operations are also related to volumes, but in the API they are separated under the `/exports` endpoints, so in our library they are also grouped under the `exports` path on the authentication object (the `Exports` resource class).
A single export job is represented as an `Export` class object containing information about the file that has been or is being exported, from which project/folder on the Platform, to which volume, the export start and finish times, the status of the job, logs, etc.
Users can preview and query all export jobs they've created for the purpose of exporting their files from the Platform into a cloud bucket using volumes. The output is a `Collection` object storing a list of exports in its `items` field and providing pagination options.
```r
# List exports
all_exports <- a$exports$query()

# Limit results to 5
exp_limit5 <- a$exports$query(limit = 5)

# Load next page of 5 results
exp_limit5$next_page(advance_access = TRUE)

# List all results until last page
exp_limit5$all()
```
It is possible to use some query parameters as filtering criteria, like `volume`, `state`, etc.:
```r
# List exports with status RUNNING or FAILED
exp_states <- a$exports$query(state = c("RUNNING", "FAILED"))

# List exports into a specific volume
exp_volume <- a$exports$query(
  volume = "<volume_owner_or_division>/<volume_name>" # volume object or ID
)
```
Listing exports is also available on `Volume` objects, where the results contain all files exported to the specific volume they're called from.
```r
# Get the volume for which you want to list all exports
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# List exports
vol1$list_exports()
```
Similar to other resource classes, the `get()` method will return a single export job object when provided with a job ID.
```r
# Get a single export
exp_obj <- a$exports$get(id = "<export_job_id>")
```
Users are able to fetch details for multiple export jobs by calling one bulk action, the `bulk_get()` method. The accepted input can be a list of export job IDs or a list of export job objects (of class `Export`). The result will be a `Collection` object containing a list of (updated) export jobs.
```r
# Get details of multiple export jobs
export_jobs <- a$exports$bulk_get(
  exports = list("<export_job_id-1>", "<export_job_id-2>")
)
```
In order to export Platform files into volumes, users can use the `submit_export()` method from the `Auth$exports` path, or directly on the selected `File` object (the file they want to export), where this function is also available.
```r
# First, get the volume you want to export files to
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# Get the File object/id you want to export from the platform
test_file <- a$files$get("<file_id>")

# Perform a file export
exp_job1 <- a$exports$submit_export(
  source_file = test_file,
  destination_volume = vol1,
  destination_location = "new_volume_file.txt" # new name
)
```
Preview export job details with the `print()` method:
```r
# Print export job info
print(exp_job1)
```
Bear in mind that exporting folders from the Platform to volumes is not possible with this function. For such cases (or for exporting multiple files) it is better to use the bulk actions described below.
Users can also export files into specific volume directories by providing the prefix within the `destination_location` parameter as a folder name, which will then be virtually created on the volume:
```r
# Export file into the folder 'test_folder'
exp_job2 <- a$exports$submit_export(
  source_file = test_file,
  destination_volume = vol1,
  destination_location = "test_folder/new_volume_file.txt" # new name
)

# Print export job info
print(exp_job2)
```
Important: To be able to export files to a volume, the volume's `access_mode` parameter must be set to `RW` when creating or modifying the volume.

In order to refresh the export job object and get up-to-date info about its state, you can always call the `reload()` function:
```r
# Reload export object
exp_job1$reload()
```
Users are able to perform a bulk action on multiple project files and export them into a volume using a single call of the `bulk_submit_export()` method from the `Auth$exports` path.

The required input is a nested list with an element for each file you want to export, containing the fields: `source_file`, `destination_volume`, `destination_location`, `overwrite` and `properties`, where `properties` accepts a list with the fields `sse_algorithm`, `sse_aws_kms_key_id` and/or `aws_canned_acl`.
```r
# First, get the project and files you want to export
test_proj <- a$projects$get("<project_id>")
proj_files <- test_proj$list_files()

# Choose the first 3 files to export
files_to_export <- proj_files$items[1:3]

# Then, get the volume you want to export files into
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# Construct the inputs list by filling the necessary information for each
# file to export
to_export <- list(
  list(
    source_file = files_to_export[[1]],
    destination_volume = vol1,
    destination_location = files_to_export[[1]]$name
  ),
  list(
    source_file = "second-file-id",
    destination_volume = vol1,
    destination_location = "my-folder/exported_second_file.txt",
    overwrite = TRUE
  ),
  list(
    source_file = files_to_export[[3]],
    destination_volume = vol1,
    destination_location = files_to_export[[3]]$name,
    overwrite = FALSE,
    properties = list(
      sse_algorithm = "AES256"
    )
  )
)

# The copy_only flag applies to all items in the bulk export job
bulk_export_jobs <- a$exports$bulk_submit_export(
  items = to_export,
  copy_only = FALSE
)

# Preview the results
bulk_export_jobs

# Get updated status by fetching details with bulk_get() and by passing the
# list of export jobs created in the previous step
a$exports$bulk_get(exports = bulk_export_jobs$items)
```
As you may see from the example above, users are able to export files into a folder on a volume.
Moreover, when constructing the inputs list you are able to pass objects of the classes `Volume` or `File` (with `type = 'file'` only, since folders can't be exported) for the `source_file` and `destination_volume` fields, in addition to using their string IDs. However, `destination_location` on the volume must be set as a character field.
Lastly, the `copy_only` flag applies to all files being exported: if `copy_only` is set to `TRUE`, each file is copied to the volume while the source file remains on the Platform.
In the example above, the list of items for the bulk export job was created manually. Alternatively, you can use the `prepare_items_for_bulk_export()` utility function to generate the items list.
This function allows you to prepare the list of bulk export items based on the provided `File` objects, filling the following fields for each item: `source_file`, `destination_volume`, `destination_location`, `overwrite` and `properties`.
Check the function's documentation and API documentation for more details.
```r
# First, get the project and files you want to export
test_proj <- a$projects$get("<project_id>")
proj_files <- test_proj$list_files()

# Then, get the volume you want to export files into
vol1 <- a$volumes$get(id = "<volume_owner_or_division>/<volume_name>")

# Select two File objects
file_1_export <- proj_files$items[[1]]
file_2_export <- proj_files$items[[2]]
files_to_export <- list(file_1_export, file_2_export)

# Construct the inputs list using the prepare_items_for_bulk_export() utility
# function
to_export <- prepare_items_for_bulk_export(
  files = files_to_export,
  destination_volume = vol1,
  destination_location_prefix = "my-folder/"
)

bulk_export_jobs <- a$exports$bulk_submit_export(items = to_export)

# Preview the results
bulk_export_jobs

# Get updated status by fetching details with bulk_get() and by passing the
# list of export jobs created in the previous step
a$exports$bulk_get(exports = bulk_export_jobs$items)
```
Keep in mind that `prepare_items_for_bulk_export()` is designed solely to assist in constructing the list of items for submitting a bulk export job. It operates under certain constraints; refer to the function's documentation for further details. After obtaining the function's output, you can manually adjust individual items as needed.