build_corpus: Build a Corpus of Works from the Internet Archive
In mariolaespinosa/historicalnetworks: Mapping Historical Citation Networks


View source: R/build_corpus.R

build_corpus downloads the OCR text versions of works found by searching the Internet Archive's metadata for the specified `keywords` over a given `date_range` (provided in the format "yyyy TO yyyy"). It returns a data frame that includes the Internet Archive's metadata about the retrieved works, along with the location of the corresponding text files.

build_corpus(
  keywords,
  date_range = "1700 TO 1899",
  download_dir = "data-raw/corpus",
  max_results = 10000,
  chime = TRUE
)

`keywords`	A vector of keywords to search for in the metadata of the Internet Archive's text collection.
`date_range`	The desired date range to search, specified in the format "yyyy TO yyyy".
`download_dir`	The directory (relative to your working directory) to which files from the Internet Archive will be downloaded.
`max_results`	The maximum number of text results.
`chime`	Should the function chime on completion?
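
Because `date_range` must be a single string in the form "yyyy TO yyyy", it can be convenient to build it from two numeric years. The sketch below is only an illustration: `make_date_range` is a hypothetical helper, not a function in historicalnetworks, and the years shown are arbitrary. The call to build_corpus is left commented out because it would download files.

```r
# Hypothetical helper (not part of historicalnetworks): formats two years
# as the "yyyy TO yyyy" string described for the date_range argument.
make_date_range <- function(start_year, end_year) {
  stopifnot(
    is.numeric(start_year), is.numeric(end_year),
    length(start_year) == 1, length(end_year) == 1,
    start_year <= end_year
  )
  sprintf("%04d TO %04d", as.integer(start_year), as.integer(end_year))
}

# Arbitrary example years, chosen only for illustration.
yf_range <- make_date_range(1793, 1805)

# Passing the string on (commented out: this would download files).
# yf_corpus <- build_corpus(
#   keywords = "yellow fever",
#   date_range = yf_range,
#   max_results = 100,
#   chime = FALSE
# )
```

Validating the years before formatting catches a reversed range early, instead of sending a search that cannot match anything.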

Details needed

A data frame representing the corpus of downloaded texts.

## Not run: 
 yf_corpus <- build_corpus(keywords = "yellow fever")

## End(Not run)

mariolaespinosa/historicalnetworks documentation built on Feb. 9, 2022, 12:31 p.m.
