cdx

Query Web Archive Crawl Indexes ('CDX')

Description

Methods are provided to retrieve web archive crawl index ('CDX') metadata and directly query the 'CDX' 'API' endpoint to retrieve mementos for a given set of parameters.
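
As a rough illustration of what "a given set of parameters" means for a 'CDX' 'API' endpoint, the sketch below assembles a query URL by hand. It is not taken from the package and may differ from how cdx_query() works internally. The helper name build_cdx_url, the example.org endpoint, and the parameter names url, output and limit (as used by pywb-style 'CDX' servers) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the cdx package): build a query URL for a
# CDX server endpoint from a URL pattern plus optional extra parameters.
build_cdx_url <- function(endpoint, url_pattern, ...) {
  # `url` and `output` are parameter names assumed from pywb-style CDX servers
  params <- c(list(url = url_pattern, output = "json"), list(...))

  # percent-encode each value, including reserved characters such as "*"
  encoded <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )

  # join the name=value pairs and append them to the endpoint
  paste0(endpoint, "?", paste0(names(encoded), "=", encoded, collapse = "&"))
}

# placeholder endpoint, for illustration only
build_cdx_url("https://example.org/cdx-index", "*.r-project.org", limit = 5)
```

Extra named arguments are passed through unchanged, so any further filter the server supports can be added without altering the helper.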

What's Inside The Tin

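The following functions are used in the example below: fetch_collections_index(), which retrieves the index of crawl collections, and cdx_query(), which queries a 'CDX' 'API' endpoint.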

Installation

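# install the development version from GitHub (requires the devtools package)
devtools::install_github("hrbrmstr/cdx")
# widen console output so that wide results print legibly
options(width=120)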

Usage

library(cdx)
library(tidyverse)

# current version
packageVersion("cdx")

Example

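# fetch the index of available crawl collections
cidx <- fetch_collections_index()

# query the first collection's 'CDX' 'API' endpoint for captures under r-project.org
rprj <- cdx_query(cidx$cdx_api[1], "*.r-project.org")

# print the result
rprj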
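
'CDX' records conventionally carry a 14-digit capture timestamp of the form YYYYMMDDhhmmss in UTC. Converting it to a date-time is a common next step after a query. The sketch below uses a small hand-made data frame rather than real query output. The column name timestamp is an assumption and may not match what cdx_query() returns.

```r
# Illustrative rows only; these are not real query results, and the
# `timestamp` column name is an assumption about the result's structure.
captures <- data.frame(
  timestamp = c("20190418123456", "20190101000000"),
  stringsAsFactors = FALSE
)

# parse the 14-digit CDX timestamp (YYYYMMDDhhmmss, UTC) into POSIXct
captures$captured_at <- as.POSIXct(
  captures$timestamp,
  format = "%Y%m%d%H%M%S",
  tz = "UTC"
)

# order captures from oldest to newest
captures <- captures[order(captures$captured_at), ]
captures
```

Once parsed, the date-time column can be used for sorting or for filtering to a date range.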


hrbrmstr/cdx documentation built on May 24, 2019, 2:46 p.m.