publish_dataset_edi: Publish a dataset at the EDI repository

View source: R/user.R

publish_dataset_edi R Documentation

Publish a dataset at the EDI repository

Description

This function publishes a dataset (or "data package") to the EDI research data repository using metadata derived from an LTER Metabase. The user must supply credentials for both the metabase and EDI (see the load_metabase_cred and load_destination_cred functions), along with the appropriate metabase database name and EDI environment name.

Usage

publish_dataset_edi(
  datasetid,
  mb.name,
  mb.cred,
  edi.cred,
  edi.env = "staging",
  dry.run = TRUE,
  s3.upload = TRUE,
  multi.part = FALSE,
  skip_checks = FALSE,
  bucket.name = Sys.getenv("AWS_S3_BUCKETNAME")
)

Arguments

datasetid

ID number of the dataset to find in metabase and update in EDI

mb.name

name of the metabase database in the postgres cluster

mb.cred

list of credentials for the metabase postgres cluster

edi.cred

list of credentials to use for EDI

edi.env

name of the EDI environment to update (staging, production, or development)

dry.run

boolean value (T/F). If TRUE, write the EML only and then stop, before the s3 and EDI uploads. If FALSE, continue on to publish.

s3.upload

boolean value (T/F). If TRUE, upload the data entities to the s3 bucket. If FALSE, skip the upload (for example, when the entities are already there). Note that the function does not currently check whether the entities are present in the bucket.

skip_checks

boolean value (T/F). If TRUE, skip the check for congruence between data entity and attribute metadata (check_attribute_congruence function). If FALSE, run the check. You may want to set this to TRUE if the data are online and not in the working directory.

bucket.name

name of the s3 bucket to push data entities to
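
As an illustration of how these arguments fit together, the following sketch assembles a dry-run call against the staging environment. The dataset ID, the database name, and the empty credential lists are hypothetical placeholders. In practice the credential lists would come from load_metabase_cred and load_destination_cred, whose arguments are not described on this page. The sketch also assumes that publish_dataset_edi is exported from the jerald package, and the call is guarded so that nothing runs unless the flag is switched on.

```r
# Hypothetical values for illustration only; substitute your own.
mb_cred  <- list()   # placeholder for the list from load_metabase_cred
edi_cred <- list()   # placeholder for the list from load_destination_cred

call_args <- list(
  datasetid = 999,             # hypothetical dataset ID
  mb.name   = "my_metabase",   # hypothetical metabase database name
  mb.cred   = mb_cred,
  edi.cred  = edi_cred,
  edi.env   = "staging",
  dry.run   = TRUE             # write the EML only; nothing is uploaded
)

# Guard so the sketch does nothing unless explicitly enabled.
run_it <- FALSE
if (run_it && requireNamespace("jerald", quietly = TRUE)) {
  # Assumes publish_dataset_edi is exported by jerald.
  do.call(jerald::publish_dataset_edi, call_args)
}
```

Starting with dry.run = TRUE and edi.env = "staging" keeps the first pass free of uploads; switching dry.run to FALSE is then a one-line change to the same argument list.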

Details

The basic process is to:

  1. Pull metadata for the dataset from the metabase using MetaEgress

  2. Query for the current dataset revision in the EDI environment

  3. Write an EML document for the next revision to go to EDI

  4. Push EML data entities from the working directory to an s3 bucket

  5. Push the EML document to EDI, which triggers PASTA to pull data from the s3 bucket and update the data package.
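
The interaction between these steps and the dry.run and s3.upload arguments can be sketched with a small helper. planned_steps is a hypothetical function written for this illustration, not part of the package, and it encodes only what the argument descriptions state: a dry run stops after the EML is written, and s3.upload = FALSE skips the s3 push. The actual function may behave differently in details not documented here.

```r
# Hypothetical helper (not part of the package): lists which of the
# documented steps would run for a given dry.run / s3.upload setting.
planned_steps <- function(dry.run = TRUE, s3.upload = TRUE) {
  stopifnot(is.logical(dry.run), length(dry.run) == 1, !is.na(dry.run),
            is.logical(s3.upload), length(s3.upload) == 1, !is.na(s3.upload))

  # Steps 1-3 run in every case, including a dry run.
  steps <- c("pull metadata", "query current revision", "write EML")

  # A dry run ends before the s3 and EDI uploads.
  if (dry.run) return(steps)

  # Step 4 is optional; step 5 always follows when publishing.
  if (s3.upload) steps <- c(steps, "push entities to s3")
  c(steps, "push EML to EDI")
}

planned_steps()                                    # defaults: a dry run
planned_steps(dry.run = FALSE, s3.upload = FALSE)  # entities already on s3
```

Note that with s3.upload = FALSE the EML is still pushed to EDI, so the data entities need to be in the bucket already, since the function does not check for them.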


jornada-im/jerald documentation built on Jan. 29, 2025, 11:15 p.m.