rdpla



rdpla: R client for Digital Public Library of America

Digital Public Library of America brings together metadata from libraries, archives, and museums in the US, and makes it freely available via their web portal as well as an API. DPLA's portal and API don't provide the items themselves from contributing institutions, but they provide links to make it easy to find things. The kinds of things DPLA holds metadata for include images of works held in museums, photographs from various photographic collections, texts, sounds, and moving images.

DPLA has a great API with good documentation, a rare thing in this world. Further documentation is available for the API's search fields, along with examples of queries and information on the metadata schema.

DPLA data can be used for a variety of use cases in various academic and non-academic fields. Here are some examples (vignettes to come soon showing examples):

The DPLA API has two main services, items and collections, each covered in its own search section below.

rdpla also has an interface (dpla_bulk) to download bulk and compressed JSON data.

Note that you can only run the examples, vignettes, and tests if you have an API key. See ?dpla_get_key for how to get one.

Tutorials

There are two vignettes; check them out after installation. If installing from GitHub, build them with devtools::install_github("ropensci/rdpla", build_vignettes = TRUE)

Installation

Stable version from CRAN

install.packages("rdpla")

Dev version from GitHub:

install.packages("devtools")
devtools::install_github("ropensci/rdpla")
library('rdpla')

Authentication

You need an API key to use the DPLA API. Use dpla_get_key() to request a key, which will then be emailed to you. You can either pass the key to the key parameter of the functions in this package, or store it once: in your .Renviron file as DPLA_API_KEY, or in your .Rprofile file as the option dpla_api_key.
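
The lookup described above can be sketched in base R. The helper name resolve_dpla_key() is hypothetical and not part of rdpla, and the order of precedence used here (explicit argument, then environment variable, then R option) is an assumption, not documented package behavior.

```r
# Hypothetical helper (not part of rdpla): find a DPLA API key.
# Assumed precedence: explicit argument, then the DPLA_API_KEY
# environment variable, then the dpla_api_key option.
resolve_dpla_key <- function(key = NULL) {
  if (!is.null(key) && nzchar(key)) return(key)
  env_key <- Sys.getenv("DPLA_API_KEY", unset = "")
  if (nzchar(env_key)) return(env_key)
  opt_key <- getOption("dpla_api_key", default = "")
  if (nzchar(opt_key)) return(opt_key)
  stop("No DPLA API key found; see ?dpla_get_key", call. = FALSE)
}

# Set a placeholder key for the current session only. A line such as
# DPLA_API_KEY=your-key-here in ~/.Renviron would make it persistent.
Sys.setenv(DPLA_API_KEY = "your-key-here")
resolve_dpla_key()
```

Storing the key in .Renviron keeps it out of scripts that might be shared or committed to version control.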

Search - items

Note: limiting fields returned for readme brevity.

Basic search

dpla_items(q="fruit", page_size=5, fields=c("provider","creator"))

Limit fields returned

dpla_items(q="fruit", page_size = 10, fields=c("publisher","format"))

Limit records returned

dpla_items(q="fruit", page_size=2, fields=c("provider","title"))
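
page_size caps how many records come back from one request, so collecting more than one batch means asking for successive pages. The following is a minimal sketch, assuming dpla_items() accepts a page argument that mirrors the DPLA API's page parameter. pages_needed() is a hypothetical helper, and the API calls only run when rdpla is installed and a key is set.

```r
# Hypothetical helper: number of requests needed to cover `total`
# records when each request returns at most `page_size` of them.
pages_needed <- function(total, page_size) {
  ceiling(total / page_size)
}

n_pages <- pages_needed(total = 25, page_size = 10)
n_pages

# Guarded so the sketch also runs without the package or a key.
# Assumption: dpla_items() takes a `page` argument.
if (requireNamespace("rdpla", quietly = TRUE) &&
    nzchar(Sys.getenv("DPLA_API_KEY"))) {
  batches <- lapply(seq_len(n_pages), function(p) {
    rdpla::dpla_items(q = "fruit", page_size = 10, page = p,
                      fields = c("provider", "title"))
  })
}
```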

Search by date

dpla_items(q="science", date_before=1900, page_size=10, fields=c("id","date"))

Search on specific fields

dpla_items(description="obituaries", page_size=2, fields="description")
dpla_items(subject="yodeling", page_size=2, fields="subject")
dpla_items(provider="HathiTrust", page_size=2, fields="provider")

Spatial search, across all spatial fields

dpla_items(sp='Boston', page_size=2, fields=c("id","provider"))

Spatial search, by states

dpla_items(sp_state='Massachusetts OR Hawaii', page_size=2, fields=c("id","provider"))
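
When the list of states comes from a vector instead of being typed by hand, the "A OR B" string used above can be assembled with paste(). This is a small sketch; or_query() is a hypothetical helper, not an rdpla function, and the search itself only runs when rdpla is installed and a key is set.

```r
# Hypothetical helper: join several terms into the "A OR B" form
# used by the sp_state example above.
or_query <- function(terms) {
  paste(terms, collapse = " OR ")
}

states <- c("Massachusetts", "Hawaii")
state_query <- or_query(states)
state_query

# Guarded so the sketch also runs without the package or a key.
if (requireNamespace("rdpla", quietly = TRUE) &&
    nzchar(Sys.getenv("DPLA_API_KEY"))) {
  rdpla::dpla_items(sp_state = state_query, page_size = 2,
                    fields = c("id", "provider"))
}
```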

Faceted search

dpla_items(facets=c("sourceResource.spatial.state","sourceResource.spatial.country"),
      page_size=0, facet_size=5)

Search - collections

Search for collections with the words university of texas

dpla_collections(q="university of texas", page_size=2)

You can also search in the title and description fields

dpla_collections(description="east")

Visualize

Visualize metadata from the DPLA: a bar chart of the number of records per state (includes states outside the US)

out <- dpla_items(facets="sourceResource.spatial.state", page_size=0, facet_size=25)
library("ggplot2")
library("scales")
ggplot(out$facets$sourceResource.spatial.state$data, aes(reorder(term, count), count)) +
  geom_bar(stat="identity") +
  coord_flip() +
  theme_grey(base_size = 16) +
  scale_y_continuous(labels = comma) +
  labs(x="State", y="Records")
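
The bars in the plot above come out sorted because of reorder(term, count). The sketch below shows that step in isolation on a made-up data frame (the terms and counts are placeholders for illustration, not DPLA results).

```r
# Made-up terms and counts, for illustration only (not DPLA results).
toy <- data.frame(
  term  = c("A", "B", "C"),
  count = c(5, 20, 10),
  stringsAsFactors = FALSE
)

# reorder() turns `term` into a factor whose levels are sorted by `count`,
# so ggplot2 draws the bars in order of increasing count.
ordered_term <- reorder(toy$term, toy$count)
levels(ordered_term)
```

With coord_flip() applied, the first level is drawn at the bottom, so the term with the largest count ends up at the top of the chart.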




ropensci/rdpla documentation built on May 18, 2022, 6:32 p.m.