publish_dataset_edi: Publish a dataset at the EDI repository
In jornada-im/jerald: Publish Jornada datasets with ease

publish_dataset_edi

R Documentation

Publish a dataset at the EDI repository

Description

This function publishes a dataset (or "data package") at the EDI research data repository using metadata derived from an LTER Metabase. The user must supply credentials for both the metabase and EDI (see the load_metabase_cred and load_destination_cred functions), along with the name of the metabase database and the name of the target EDI environment.

Usage

publish_dataset_edi(
  datasetid,
  mb.name,
  mb.cred,
  edi.cred,
  edi.env = "staging",
  dry.run = TRUE,
  s3.upload = TRUE,
  multi.part = FALSE,
  skip_checks = FALSE,
  bucket.name = Sys.getenv("AWS_S3_BUCKETNAME")
)
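
A minimal sketch of a dry-run call against the staging environment is shown below. The dataset ID and database name are placeholders, and `mb.cred` and `edi.cred` are assumed to already exist in the session (created with load_metabase_cred and load_destination_cred, whose arguments are not shown here). The call is guarded so that nothing runs unless jerald is installed and both credential objects are present.

```r
# Placeholder values (assumptions): substitute your own dataset ID and
# metabase database name.
args <- list(
  datasetid = 210001001,
  mb.name   = "my_metabase",
  edi.env   = "staging",  # safest target while testing
  dry.run   = TRUE        # write EML only, no uploads
)

# Only attempt the call when the package and both credential lists exist.
if (requireNamespace("jerald", quietly = TRUE) &&
    exists("mb.cred") && exists("edi.cred")) {
  do.call(
    jerald::publish_dataset_edi,
    c(args, list(mb.cred = mb.cred, edi.cred = edi.cred))
  )
}
```

Starting with the defaults (`edi.env = "staging"`, `dry.run = TRUE`) lets the EML be inspected before anything is uploaded.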

Arguments

`datasetid`	ID number of the dataset to find in metabase and update in EDI
`mb.name`	name of the metabase database in the postgres cluster
`mb.cred`	list of credentials for the metabase postgres cluster
`edi.cred`	list of credentials to use for EDI
`edi.env`	name of the EDI environment to update (staging, production, or development)
`dry.run`	boolean value (T/F): if TRUE, write the EML only and stop before the s3 and EDI uploads; if FALSE, continue on to publish.
`s3.upload`	boolean value (T/F): if TRUE, upload data entities to the s3 bucket; if FALSE, skip the upload (for entities that are already there). Note that the function does not currently check whether the entities are present in the bucket.
`skip_checks`	boolean value (T/F): if TRUE, skip the check for congruence between data entity and attribute metadata (check_attribute_congruence function); if FALSE, run it. Consider setting this to TRUE if the data are online and not in the working directory.
`bucket.name`	name of the s3 bucket to push data entities to
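
How `dry.run` and `s3.upload` interact can be sketched as follows. The helper `publish_plan` is hypothetical and not part of jerald; it only lists which stages the argument descriptions above imply for a given combination of flags.

```r
# Hypothetical helper (not part of jerald): list the stages implied by
# the dry.run and s3.upload flags, as described in the arguments above.
publish_plan <- function(dry.run = TRUE, s3.upload = TRUE) {
  stopifnot(
    is.logical(dry.run), length(dry.run) == 1, !is.na(dry.run),
    is.logical(s3.upload), length(s3.upload) == 1, !is.na(s3.upload)
  )
  steps <- "write EML"
  if (dry.run) {
    return(steps)  # a dry run ends before any upload
  }
  if (s3.upload) {
    steps <- c(steps, "upload entities to s3")
  }
  c(steps, "upload EML to EDI")
}

publish_plan()                                    # defaults: EML only
publish_plan(dry.run = FALSE)                     # all three stages
publish_plan(dry.run = FALSE, s3.upload = FALSE)  # entities already in s3
```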

Details

The basic process is to:

1. Pull metadata for the dataset from the metabase using MetaEgress
2. Query for the current dataset revision in the EDI environment
3. Write an EML document for the next revision to go to EDI
4. Push EML data entities from the working directory to an s3 bucket
5. Push the EML document to EDI, which triggers PASTA to pull data from the s3 bucket and update the data package.
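
The revision step can be illustrated with a small sketch. EDI package identifiers take the form scope.identifier.revision; the helper `next_package_id` below is hypothetical (not part of jerald), the default scope "knb-lter-jrn" is an assumption, and so is the choice to start at revision 1 when no revision exists yet.

```r
# Hypothetical helper (not part of jerald): build the package ID for the
# next revision from the newest revision found in the EDI environment.
next_package_id <- function(datasetid, current_rev, scope = "knb-lter-jrn") {
  # Assumption: a dataset with no existing revision (NA) starts at 1.
  next_rev <- if (is.na(current_rev)) 1L else as.integer(current_rev) + 1L
  # as.integer() avoids scientific notation when formatting large IDs.
  sprintf("%s.%d.%d", scope, as.integer(datasetid), next_rev)
}

next_package_id(210001001, current_rev = 4)   # revision 4 exists, so write 5
next_package_id(210001001, current_rev = NA)  # nothing published yet
```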