

sul.wp

The goal of sul.wp is to help Stanford researchers access the Library's collection of Washington Post Full-Text Archives through R.

Installation

install.packages("devtools")
devtools::install_github("wrathofquan/sul.wp")

Authenticate

First log in to redivis.com with your SUNet ID. Then create your API token:

library(sul.wp)

## Authenticate with Redivis
## More on Redivis API: https://apidocs.redivis.com/authorization

# redivis_auth("your-api-token")
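To keep the token out of scripts and version control, one option is to read it from an environment variable. The sketch below is a minimal example under that assumption: the variable name `REDIVIS_API_TOKEN` and the helper `read_redivis_token()` are hypothetical and not part of sul.wp. Only `redivis_auth()` comes from the package.

```r
# Hypothetical helper (not part of sul.wp): read the API token from an
# environment variable and fail early with a clear message if it is unset.
read_redivis_token <- function(var = "REDIVIS_API_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  token
}

# Usage, assuming the variable was set beforehand (for example in ~/.Renviron):
# library(sul.wp)
# redivis_auth(read_redivis_token())
```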

Retrieve Articles by Single Year

## Get an entire year of articles

df_1977 <- get_articles_year("1977")

head(df_1977)

dim(df_1977)

Retrieve Articles by Multiple Years

## To get multiple years of articles, you can use purrr, lapply(), or a for loop.
## Example using purrr:

years <- c("1977", "1980")

df_twoYears <- purrr::map_dfr(years, get_articles_year) 

dim(df_twoYears)
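The same result can be sketched in base R with `lapply()` and `rbind()`. This assumes, as the purrr example does, that each call returns a data frame with the same columns. `bind_years()` and the stand-in `fake_fetch()` are hypothetical names used only so the snippet runs without a Redivis connection; in practice you would pass `get_articles_year` as the `fetch` argument.

```r
# Hypothetical helper: fetch one data frame per year and stack them row-wise.
# `fetch` is any function that takes a year string and returns a data frame.
bind_years <- function(years, fetch) {
  pieces <- lapply(years, fetch)
  do.call(rbind, pieces)
}

# Stand-in for get_articles_year(), used here for illustration only.
fake_fetch <- function(year) {
  data.frame(
    year = year,
    title = paste("Article from", year),
    stringsAsFactors = FALSE
  )
}

# A range of consecutive years, converted to character strings.
years <- as.character(1977:1980)
df_demo <- bind_years(years, fake_fetch)
dim(df_demo)

# With the package loaded and authenticated:
# df_range <- bind_years(years, get_articles_year)
```

Passing the fetch function as an argument keeps the stacking logic testable offline; whole years can be large, so it may be worth trying a single year before requesting a long range.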

Search Corpus by Keyword

## Search article titles and paragraphs with a case-insensitive keyword, restrict the search by year, and remove HTML formatting from articles.
## Note that some articles are assets such as embedded videos or image slideshows;
## using strip_html will likely return empty strings for this type of content.

df_blm <- search_articles(query = "Black Lives Matter", year = "2016", strip_html = TRUE)

head(df_blm)
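Because stripped HTML can leave empty strings for video or slideshow assets, you may want to drop those rows before analysis. A minimal sketch follows; the helper `drop_empty_text()` and the column name `text` are assumptions for illustration, so check `names(df_blm)` for the actual column that holds the article body.

```r
# Hypothetical helper: keep rows whose text column is non-missing and
# contains at least one non-whitespace character.
drop_empty_text <- function(df, col) {
  values <- df[[col]]
  keep <- !is.na(values) & nzchar(trimws(values))
  df[keep, , drop = FALSE]
}

# Toy data standing in for search results; `text` is an assumed column name.
toy <- data.frame(
  title = c("Story", "Video", "Slideshow"),
  text  = c("Body of the story.", "", "   "),
  stringsAsFactors = FALSE
)
toy_clean <- drop_empty_text(toy, "text")
toy_clean
```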


wrathofquan/sul.wp documentation built on Feb. 12, 2024, 6:38 a.m.