create_dataset_edi: Create a dataset at the EDI repository

View source: R/user.R

create_dataset_edi    R Documentation

Create a dataset at the EDI repository

Description

This function creates a new dataset (or "data package") at the EDI research data repository using metadata derived from an LTER Metabase. The user must supply credentials for both the metabase and EDI (see the load_metabase_cred and load_destination_cred functions), along with the metabase database name and the EDI environment name.

Usage

create_dataset_edi(
  datasetid,
  mb.name,
  mb.cred,
  edi.cred,
  edi.env = "staging",
  publish = FALSE,
  bucket.name = Sys.getenv("AWS_S3_BUCKETNAME")
)
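
A call might look like the sketch below. The dataset ID, database name, and empty credential lists are placeholders (assumptions, not values from the package). In practice the credential lists would come from load_metabase_cred and load_destination_cred, whose arguments are not shown on this page. The call is guarded so that nothing runs unless the flag is switched on.

```r
# Placeholder values for illustration only; none come from the package docs.
args <- list(
  datasetid = 999,            # hypothetical dataset ID
  mb.name   = "my_metabase",  # hypothetical metabase database name
  mb.cred   = list(),         # placeholder; see load_metabase_cred
  edi.cred  = list(),         # placeholder; see load_destination_cred
  edi.env   = "staging",      # the documented default
  publish   = FALSE           # the documented default
)

run_example <- FALSE  # set to TRUE only once real credentials are filled in
if (run_example && requireNamespace("jerald", quietly = TRUE)) {
  do.call(jerald::create_dataset_edi, args)
}
```

Keeping the arguments in a list makes it easy to inspect them before anything is sent to EDI.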

Arguments

datasetid

ID number of the dataset to find in the metabase and create in EDI

mb.name

name of the metabase database in the postgres cluster

mb.cred

list of credentials for the metabase postgres cluster

edi.cred

list of credentials to use for EDI

edi.env

name of the EDI environment to update (staging, production, or development)

publish

logical value; if TRUE, publish the dataset, if FALSE, stop before the S3 upload

bucket.name

name of the S3 bucket to push data entities to
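
As an illustration of the three accepted edi.env values, the sketch below checks an environment name with base R's match.arg. The helper check_edi_env is hypothetical and is not part of jerald; it only shows one way a caller could validate the value before passing it on.

```r
# Hypothetical helper, not part of jerald: checks an edi.env value against
# the three environment names listed in the argument description.
check_edi_env <- function(edi.env = c("staging", "production", "development")) {
  # match.arg returns the first choice when the argument is left at its
  # default, and raises an error for a value that matches none of the choices.
  match.arg(edi.env)
}

check_edi_env()              # falls back to the first choice, "staging"
check_edi_env("production")  # accepted as given
```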

Details

The basic process is to

  1. Pull metadata for the dataset from the metabase using MetaEgress

  2. Query for the current dataset revision in the EDI environment

  3. Write an EML document for the next revision to go to EDI

  4. Push EML data entities from the working directory to an S3 bucket

  5. Push the EML document to EDI, which triggers PASTA to pull data from the S3 bucket and update the data package.
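
Steps 2 and 3 hinge on working out which revision comes next. The sketch below shows that arithmetic under two assumptions: that EDI package identifiers take the form scope.identifier.revision, and that a package with no existing revision starts at revision 1. The helper next_package_id, the scope "knb-lter-jrn", and the dataset ID are illustrative only and are not taken from jerald.

```r
# Hypothetical helper, not part of jerald: build the package identifier for
# the next revision. Assumes identifiers look like scope.identifier.revision
# and that a brand-new package starts at revision 1.
next_package_id <- function(scope, datasetid, current_rev = NA) {
  # No current revision means the package does not exist yet.
  next_rev <- if (is.na(current_rev)) 1L else as.integer(current_rev) + 1L
  sprintf("%s.%d.%d", scope, as.integer(datasetid), next_rev)
}

next_package_id("knb-lter-jrn", 999)                   # new package
next_package_id("knb-lter-jrn", 999, current_rev = 3)  # existing package
```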


jornada-im/jerald documentation built on Jan. 29, 2025, 11:15 p.m.