knitr::opts_chunk$set(
  warning = FALSE,
  message = FALSE,
  collapse = TRUE,
  comment = "#>",
  cache.path = "inst/cache/"
)

oai: General Purpose 'Oai-PMH' Services Client

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

oai is an R client for working with OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) services. The protocol was developed by the Open Archives Initiative (https://en.wikipedia.org/wiki/Open_Archives_Initiative) and uses the XML data format transported over HTTP.
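Since each OAI-PMH request is an HTTP request whose query string carries a verb plus optional arguments, it can help to see what such a URL looks like. The following is a minimal base R sketch. `build_oai_url()` is a hypothetical helper written for illustration and is not part of oai, which builds and sends requests for you.

```r
# Hypothetical helper (not part of oai): build an OAI-PMH request URL
# from a base URL, a verb, and optional named arguments.
build_oai_url <- function(base_url, verb, ...) {
  args <- c(list(verb = verb), list(...))
  # Drop arguments that were passed as NULL
  args <- args[!vapply(args, is.null, logical(1))]
  # URL-encode each value and join everything as key=value pairs
  values <- vapply(
    args,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(args), "=", values, collapse = "&"))
}

# An Identify request needs only the verb
identify_url <- build_oai_url("http://oai.datacite.org/oai", "Identify")

# A ListIdentifiers request also needs a metadataPrefix; from/until are optional
list_url <- build_oai_url(
  "http://oai.datacite.org/oai", "ListIdentifiers",
  metadataPrefix = "oai_dc", from = "2018-05-01", until = "2018-06-01"
)

print(identify_url)
print(list_url)
```

The service answers such a request with an XML document, which oai parses for you.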

OAI-PMH Info:

oai is built on xml2 and httr. In addition, we return data.frames whenever possible to make data comprehension, manipulation, and visualization easier. We also have functions to fetch a large directory of OAI-PMH services; it isn't exhaustive, but it does contain a lot.

Instead of paging with, e.g., page and per_page parameters, OAI-PMH optionally uses resumptionTokens, which may carry an expiration date. A token lets you continue on to the next chunk of data if the first request did not reach the end. OAI-PMH services often limit each request to 50 records, but this may vary by provider; we don't know for sure. This package runs a while loop for you internally until all records are retrieved. In the future we may expose, e.g., a limit parameter so you can say how many records you want, but we haven't done this yet.
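The token-following loop can be sketched as below. This is an illustration under stated assumptions, not the package's internal code: `fetch_chunk()` is a hypothetical stand-in for a single HTTP request, the records are fake, and the token is simply the index of the next record (real tokens are opaque strings chosen by the provider).

```r
# Fake data standing in for a remote repository (assumption for illustration)
fake_records <- sprintf("oai:example.org:%03d", 1:120)

# Hypothetical stand-in for one request: returns up to `chunk_size` records
# and a token for the next chunk, or a NULL token when nothing is left.
fetch_chunk <- function(token = NULL, chunk_size = 50) {
  start <- if (is.null(token)) 1L else as.integer(token)
  end <- min(start + chunk_size - 1L, length(fake_records))
  next_token <- if (end < length(fake_records)) as.character(end + 1L) else NULL
  list(records = fake_records[start:end], token = next_token)
}

# Keep requesting chunks while the previous response carried a token
harvest_all <- function() {
  res <- fetch_chunk()
  chunks <- list(res$records)
  while (!is.null(res$token)) {
    res <- fetch_chunk(res$token)
    chunks[[length(chunks) + 1L]] <- res$records
  }
  unlist(chunks)
}

all_records <- harvest_all()
print(length(all_records))
```

With chunks of 50, the sketch makes three requests to collect the 120 fake records. A limit parameter like the one mentioned above would amount to stopping this loop early.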

Install

Install from CRAN

install.packages("oai")

Development version

devtools::install_github("ropensci/oai")
library("oai")

Identify

id("http://oai.datacite.org/oai")

ListIdentifiers

list_identifiers(from = '2018-05-01', until = '2018-06-01')

Count Identifiers

count_identifiers()

ListRecords

list_records(from = '2018-05-01', until = '2018-05-15')

GetRecord

ids <- c("87832186-00ea-44dd-a6bf-c2896c4d09b4", "d981c07d-bc43-40a2-be1f-e786e25106ac")
get_records(ids)

ListMetadataFormats

list_metadataformats(id = "87832186-00ea-44dd-a6bf-c2896c4d09b4")

ListSets

list_sets("http://api.gbif.org/v1/oai-pmh/registry")

Examples of other OAI providers

Biodiversity Heritage Library

Identify

id("http://www.biodiversitylibrary.org/oai")

Get records

get_records(c("oai:biodiversitylibrary.org:item/7", "oai:biodiversitylibrary.org:item/9"),
            url = "http://www.biodiversitylibrary.org/oai")

Acknowledgements

Michał Bojanowski thanks the National Science Centre for support through grant 2012/07/D/HS6/01971.

Meta

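library(hexSticker)
library(magick)

# Download the source image for the logo to a temporary file
fig <- tempfile(fileext = ".png")
download.file(
  "https://wiki.epc.ub.uu.se/download/thumbnails/34212946/images.png?version=1&modificationDate=1511952524492&api=v2",
  destfile = fig,
  mode = "wb"
)

# Crop the image and write the result to a second temporary file
fig2 <- tempfile(fileext = ".png")
image_read(fig) |>
  image_crop(geometry = "x90%") |>
  image_write(path = fig2)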

sticker(
  fig2,
  package = "oai",
  url = "https://docs.ropensci.org/oai/",
  s_x = 1,
  s_y = 0.83, 
  s_width = 0.7,
  p_size = 25,
  p_y = 1.5,
  u_color = "#ffffff",
  u_size = 5,
  h_fill = "#0C0F2E",
  h_color = "black",
  filename = "man/figures/logo.png"
)


ropensci/oai documentation built on Nov. 18, 2022, 5:33 p.m.