create_dataset_edi: Create a dataset at the EDI repository
In jornada-im/jerald: Publish Jornada datasets with ease

create_dataset_edi

R Documentation

Description

This function creates a new dataset (or "data package") at the EDI research data repository using metadata derived from an LTER Metabase. The user must supply credentials for the metabase and for EDI (see the load_metabase_cred and load_destination_cred functions), along with the appropriate metabase database name and EDI environment name.

Usage

create_dataset_edi(
  datasetid,
  mb.name,
  mb.cred,
  edi.cred,
  edi.env = "staging",
  publish = FALSE,
  bucket.name = Sys.getenv("AWS_S3_BUCKETNAME")
)

Arguments

`datasetid`	ID number of the dataset to find in metabase and update in EDI
`mb.name`	name of the metabase database in the postgres cluster
`mb.cred`	list of credentials for the metabase postgres cluster
`edi.cred`	list of credentials to use for EDI
`edi.env`	name of the EDI environment to update (staging, production, or development)
`publish`	logical value; if TRUE, publish the dataset, if FALSE, stop before the S3 upload
`bucket.name`	name of the S3 bucket to push data entities to; defaults to the AWS_S3_BUCKETNAME environment variable
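
As a rough illustration of how the `edi.env` and `bucket.name` arguments behave, the sketch below uses a stand-in function, `check_edi_args()`. This helper is hypothetical and not part of jerald; it only mirrors the two defaults shown under Usage. Whether create_dataset_edi performs similar checks internally is not stated here, and the bucket name is a placeholder.

```r
# Hypothetical pre-flight check; NOT part of jerald. It only mirrors the
# edi.env and bucket.name defaults shown in the Usage section.
check_edi_args <- function(edi.env = "staging",
                           bucket.name = Sys.getenv("AWS_S3_BUCKETNAME")) {
  # Allow only the three environment names listed under Arguments
  if (!edi.env %in% c("staging", "production", "development")) {
    stop("edi.env must be 'staging', 'production', or 'development'")
  }
  # Sys.getenv() returns "" when the variable is unset
  if (!nzchar(bucket.name)) {
    stop("AWS_S3_BUCKETNAME is not set and no bucket.name was supplied")
  }
  list(edi.env = edi.env, bucket.name = bucket.name)
}

# "my-example-bucket" is a placeholder bucket name
Sys.setenv(AWS_S3_BUCKETNAME = "my-example-bucket")
checked <- check_edi_args()
checked$bucket.name
```

Because R evaluates default arguments lazily, the environment variable is read when the function is called, so it can be set at any point in the session before the call.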

Details

The basic process is to:

1. Pull metadata for the dataset from the metabase using MetaEgress.
2. Query for the current dataset revision in the EDI environment.
3. Write an EML document for the next revision to go to EDI.
4. Push EML data entities from the working directory to an S3 bucket.
5. Push the EML document to EDI, which triggers PASTA to pull data from the S3 bucket and update the data package.
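
The revision step in the process above can be sketched as follows. This is a minimal illustration, not the package's actual code: `next_package_id()` is a hypothetical helper, and it assumes the usual EDI convention of identifying a data package as scope.identifier.revision, with a dataset that has no revision yet starting at revision 1. The scope and ID values are placeholders.

```r
# Hypothetical helper; NOT part of jerald. Builds the identifier of the next
# revision from the current revision number.
next_package_id <- function(scope, datasetid, current_rev = NULL) {
  # Assumption: a dataset with no revision yet starts at revision 1
  next_rev <- if (is.null(current_rev)) 1L else as.integer(current_rev) + 1L
  # %d with integers avoids scientific notation for large ID numbers
  sprintf("%s.%d.%d", scope, as.integer(datasetid), next_rev)
}

# "knb-lter-jrn" and 210000000 are placeholder values
next_package_id("knb-lter-jrn", 210000000, current_rev = 3)
# "knb-lter-jrn.210000000.4"
next_package_id("knb-lter-jrn", 210000000)
# "knb-lter-jrn.210000000.1"
```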