rdpla: R client for Digital Public Library of America
The Digital Public Library of America (DPLA) brings together metadata from libraries, archives, and museums in the US, and makes it freely available via its web portal as well as an API. DPLA's portal and API don't provide the items themselves from contributing institutions, but they provide links to make it easy to find things. The kinds of things DPLA holds metadata for include images of works held in museums, photographs from various photographic collections, texts, sounds, and moving images.
DPLA has a great API with good documentation, a rare thing in this world. Further API documentation covers the available search fields and gives examples of queries. The metadata schema is documented separately.
The DPLA API has two main services, each with a matching function in rdpla:
items: available through the dpla_items() function.
collections: available through the dpla_collections() function.
rdpla also has an interface (dpla_bulk) to download bulk and compressed JSON data.
Note that you can only run examples/vignette/tests if you have an API key. See below for an example of how to get an API key.
Install from CRAN
install.packages("rdpla")
Development version
if (!requireNamespace("devtools")) {
  install.packages("devtools")
}
devtools::install_github("ropensci/rdpla")
Load rdpla
library("rdpla")
If you already have a DPLA API key, make sure it's in your .Renviron or .Rprofile file.
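One way to confirm the key is visible to R is to look it up in the environment. The sketch below assumes the key is stored in an environment variable named DPLA_API_KEY (for example, a line DPLA_API_KEY=<your key> in .Renviron). That variable name is an assumption here, so check it against the rdpla documentation.

```r
# Assumption: the key is stored in an environment variable named DPLA_API_KEY,
# e.g. via a line `DPLA_API_KEY=<your key>` in your .Renviron file.
key_var <- "DPLA_API_KEY"

# Sys.getenv() returns "" (the `unset` value) when the variable is not defined
key <- Sys.getenv(key_var, unset = "")
has_key <- nzchar(key)

if (has_key) {
  message("Found a value for ", key_var)
} else {
  message("No value for ", key_var, "; add it to .Renviron and restart R")
}
```

Changes to .Renviron are only picked up when R starts, so restart the session after editing the file.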
If you don't have a DPLA API key, use the dpla_get_key() function to get a key.
You only need a valid email address to get a key, for example:
dpla_get_key(email = "foo@bar.com")
#> API key created and sent via email. Be sure to check your Spam folder, too.
Note: limiting fields returned for readme brevity.
Basic search
dpla_items(q="fruit", page_size=5, fields=c("provider","creator"))
Limit fields returned
dpla_items(q="fruit", page_size = 10, fields=c("publisher","format"))
Limit records returned
dpla_items(q="fruit", page_size=2, fields=c("provider","title"))
Search by date
dpla_items(q="science", date_before=1900, page_size=10, fields=c("id","date"))
Search on specific fields
dpla_items(description="obituaries", page_size=2, fields="description")
dpla_items(subject="yodeling", page_size=2, fields="subject")
dpla_items(provider="HathiTrust", page_size=2, fields="provider")
Spatial search, across all spatial fields
dpla_items(sp='Boston', page_size=2, fields=c("id","provider"))
Spatial search, by states
dpla_items(sp_state='Massachusetts OR Hawaii', page_size=2, fields=c("id","provider"))
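When the list of states comes from a vector rather than a hand-written string, the same OR query can be assembled with paste(). This is a minimal sketch: the variable names states and state_query are illustrative, and the API call is guarded so it only runs in an interactive session with rdpla installed (and a key configured).

```r
# States to match; any character vector of state names works here
states <- c("Massachusetts", "Hawaii")

# Join the names with the OR operator used in the query above
state_query <- paste(states, collapse = " OR ")
state_query
#> [1] "Massachusetts OR Hawaii"

# Only call the API in an interactive session with rdpla installed
if (interactive() && requireNamespace("rdpla", quietly = TRUE)) {
  rdpla::dpla_items(sp_state = state_query, page_size = 2,
                    fields = c("id", "provider"))
}
```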
Faceted search
dpla_items(facets=c("sourceResource.spatial.state","sourceResource.spatial.country"), page_size=0, facet_size=5)
Search for collections with the words university of texas
dpla_collections(q="university of texas", page_size=2)
You can also search in the title and description fields
dpla_collections(description="east")
Visualize metadata from the DPLA: bar chart of the number of records per state (includes states outside the US)
out <- dpla_items(facets="sourceResource.spatial.state", page_size=0, facet_size=25)
library("ggplot2")
library("scales")
ggplot(out$facets$sourceResource.spatial.state$data, aes(reorder(term, count), count)) +
  geom_bar(stat="identity") +
  coord_flip() +
  theme_grey(base_size = 16) +
  scale_y_continuous(labels = comma) +
  labs(x="State", y="Records")
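The reorder(term, count) call in the plot is what sorts the bars by record count. The sketch below shows that step on its own, using a small made-up data frame in place of the real facet data. The term and count column names come from the plot code above, while the values and the numeric type of count are placeholders assumed for illustration.

```r
# Hypothetical stand-in for out$facets$sourceResource.spatial.state$data;
# the values are placeholders, not real DPLA counts
facet_data <- data.frame(
  term = c("State A", "State B", "State C"),
  count = c(120, 4500, 37),
  stringsAsFactors = FALSE
)

# reorder() returns a factor whose levels are sorted by increasing count,
# which is the order ggplot2 uses when laying out the bars
ordered_terms <- reorder(facet_data$term, facet_data$count)
levels(ordered_terms)
#> [1] "State C" "State A" "State B"
```

If count arrives as character rather than numeric in the real result, converting it with as.numeric() before plotting would be needed for this ordering to be meaningful.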