syncToSynapse: syncToSynapse

syncToSynapse    R Documentation

syncToSynapse

Description

Synchronizes files specified in the manifest file to Synapse.

Given a file describing all of the uploads, this uploads the content to Synapse and optionally notifies you via Synapse messaging (email) at specific intervals, on errors, and on completion.

Usage

syncToSynapse(manifestFile, dryRun=FALSE, sendMessages=TRUE, retries=4, merge_existing_annotations=TRUE, associate_activity_to_new_version=FALSE)

Arguments

manifestFile

A TSV file with file locations and metadata to be pushed to Synapse. See Details for the manifest file format.

dryRun

If TRUE, performs validation without uploading. Defaults to FALSE.

sendMessages

If TRUE, sends out messages on completion. Defaults to TRUE.

retries

Number of retries to attempt if an error occurs. Defaults to 4.

merge_existing_annotations

If TRUE, merges the annotations in the manifest file with the existing annotations on Synapse. If FALSE, overwrites the existing annotations on Synapse with the annotations in the manifest file. Defaults to TRUE.

associate_activity_to_new_version

If TRUE and a version update occurs, the existing activity in Synapse is associated with the new version. The exception is when you specify new values for used or executed: in that case a new activity is created for the new version of the entity. Defaults to FALSE.

Details

Manifest File Format:

The manifest is a tab-delimited file with one row per file to upload and columns describing the file. The minimum required columns are path and parent, where path is the local file path and parent is the Synapse ID of the project or folder the file is uploaded to. In addition to these columns, you can specify any of the parameters to the File constructor (name, synapseStore, contentType) as well as parameters to the syn.store command (used, executed, activityName, activityDescription, forceVersion). The used and executed columns can be semicolon (";") separated lists of Synapse IDs, URLs, and/or local file paths of files already stored in Synapse (or being stored in Synapse by the manifest). Any additional columns are added as annotations.

Required fields:

Field    Meaning                  Example
path     local file path or URL   /path/to/local/file.txt
parent   Synapse ID               syn1235
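
As a minimal sketch of the two required columns, the following R snippet builds a two-row manifest and writes it as a tab-delimited file. The temporary files and the Synapse ID syn1235 are placeholders, not real entities, and the syncToSynapse call is left commented out because it needs the package installed and a Synapse login.

```r
# Placeholder local files; in practice these are the files you want to upload.
files <- c(tempfile(fileext = ".txt"), tempfile(fileext = ".txt"))
for (f in files) writeLines("example content", f)

# One row per file: 'path' is the local file, 'parent' the target project or folder.
manifest <- data.frame(
  path = files,
  parent = "syn1235",  # placeholder Synapse ID
  stringsAsFactors = FALSE
)

# Write a tab-delimited manifest with a header row and no row names.
manifest_path <- tempfile(fileext = ".tsv")
write.table(manifest, file = manifest_path, sep = "\t",
            row.names = FALSE, quote = FALSE)

# Validate without uploading (requires synapserutils and a Synapse login):
# syncToSynapse(manifest_path, dryRun = TRUE)
```

Running with dryRun = TRUE first is one way to catch a malformed manifest before any upload starts.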

Common fields:

Field          Meaning                     Example
name           name of file in Synapse     Example_file
forceVersion   whether to update version   False

Provenance fields:

Field                 Meaning                               Example
used                  List of items used to generate file   syn1235; /path/to_local/file.txt
executed              List of items executed                https://github.org/; /path/to_local/code.py
activityName          Name of activity in provenance        "Ran normalization"
activityDescription   Text description of what was done     "Ran algorithm xyx with parameters..."
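
The provenance columns hold several items in a single cell, so it can help to build them from an R vector. The sketch below joins placeholder items with ";" and shows that splitting on the same separator recovers them. All IDs, URLs, and paths are the illustrative values from the table above, and whether the package tolerates a space after the semicolon is not stated here, so the sketch uses none.

```r
# Items used to generate the file: a Synapse ID and a local path (placeholders).
used_items <- c("syn1235", "/path/to_local/file.txt")
# Items executed: a URL and a local script (placeholders).
executed_items <- c("https://github.org/", "/path/to_local/code.py")

manifest <- data.frame(
  path = "/path/to/local/file.txt",
  parent = "syn1235",
  used = paste(used_items, collapse = ";"),
  executed = paste(executed_items, collapse = ";"),
  activityName = "Ran normalization",
  stringsAsFactors = FALSE
)

# Splitting on ";" recovers the individual items.
strsplit(manifest$used, ";", fixed = TRUE)[[1]]
```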

Annotations:

Any columns that are not among the reserved names described above will be interpreted as annotations of the file.
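
To preview which columns of a manifest would end up as annotations, one option is to compare its column names against the reserved names. This is only a local sketch: the reserved vector is assembled from the fields listed on this page and may not match exactly what the package checks, and the species and assay columns are hypothetical annotations.

```r
# Column names reserved by the manifest format, as listed in this section.
reserved <- c("path", "parent", "name", "synapseStore", "contentType",
              "used", "executed", "activityName", "activityDescription",
              "forceVersion")

manifest <- data.frame(
  path = "/path/to/local/file.txt",
  parent = "syn1235",          # placeholder Synapse ID
  species = "Homo sapiens",    # hypothetical annotation
  assay = "rnaSeq",            # hypothetical annotation
  stringsAsFactors = FALSE
)

# Columns outside the reserved set would be treated as annotations.
annotation_columns <- setdiff(names(manifest), reserved)
annotation_columns
```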

Other optional fields:

Field          Meaning                                        Example
synapseStore   Boolean describing whether to upload files     True
contentType    content type of file, overriding the default   text/html

Example manifest file:

path              parent     annot1   annot2   used                        executed
/path/file1.txt   syn1243    "bar"    3.1415   "syn124; /path/file2.txt"   "https://github.org/foo/bar"
/path/file2.txt   syn12433   "baz"    2.71     ""                          "https://github.org/foo/baz"

Examples

  ## Not run: 
    # synchronizes files
    syncToSynapse("/path/to/manifest.tsv")
    
    # add an annotation
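    # read the manifest as tab-delimited so fields containing spaces stay intact
    df <- read.table('/path/to/manifest.tsv', header = TRUE, sep = '\t', stringsAsFactors = FALSE)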
    df["species"] = "Homo sapiens"
    write.table(df, file='/path/to/manifest.tsv', sep='\t', row.names = FALSE)
    syncToSynapse('/path/to/manifest.tsv')
    
    # update the annotation
    df["species"] = "Human"
    write.table(df, file='/path/to/manifest.tsv', sep='\t', row.names = FALSE)
    syncToSynapse('/path/to/manifest.tsv', merge_existing_annotations = TRUE)
    
    # overwrite the annotation
    df["species"] = "Homo sapiens"
    write.table(df, file='/path/to/manifest.tsv', sep='\t', row.names = FALSE)
    syncToSynapse('/path/to/manifest.tsv', merge_existing_annotations = FALSE)
    
    # create an Activity/Provenance for the first file by
    # creating a relationship between two files.
    # In this example, the second file in the manifest is used to generate the first file
    df[1, "used"] = df[2, "path"]
    
    # link to the tool that is used to produce the results in the first file
    df[1, "executed"] = "https://nf-co.re/rnaseq/3.14.0"
    
    # add a description for this Activity/Provenance
    df[1, "activityDescription"] = "Experiment results created as a result of the linked data while running the pipeline."
    write.table(df, file='/path/to/manifest.tsv', sep='\t', row.names = FALSE)
    syncToSynapse('/path/to/manifest.tsv')
    
    # associate Activity/Provenance with the newer version 
    # after changing the content or metadata of the file
    syncToSynapse('/path/to/manifest.tsv', associate_activity_to_new_version=TRUE)
  
## End(Not run)

Sage-Bionetworks/synapserutils documentation built on Aug. 31, 2024, 10:42 a.m.