build_corpus
Downloads the OCR text versions of works found by searching the Internet Archive's metadata for the specified 'keywords' over a given 'date_range' (provided in the format "yyyy TO yyyy"). It returns a data frame that includes the Internet Archive's metadata about the retrieved works along with the location of the corresponding text files.
build_corpus(
  keywords,
  date_range = "1700 TO 1899",
  download_dir = "data-raw/corpus",
  max_results = 10000,
  chime = TRUE
)
keywords | A vector of keywords to search for in the metadata of the Internet Archive's text collection.
date_range | The desired date range to search, specified in the format "yyyy TO yyyy".
download_dir | The directory (relative to your working directory) to which files from the Internet Archive will be downloaded.
max_results | The maximum number of text results to retrieve.
chime | Should the function chime on completion?
Details needed
A data frame representing the corpus of downloaded texts, including the Internet Archive's metadata and the location of each text file.
## Not run: 
yf_corpus <- build_corpus(keywords = "yellow fever")
## End(Not run)
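The date_range argument is expected in the format "yyyy TO yyyy". The following is a minimal sketch of checking a string against that format before passing it to build_corpus. The helper is_valid_date_range is hypothetical and not part of the package, and the requirement that the first year not exceed the second is an assumption rather than documented behavior.

```r
# Hypothetical helper (not part of the package): returns TRUE when the
# input is a single string of the form "yyyy TO yyyy" whose first year
# is not later than its second year (the ordering rule is an assumption).
is_valid_date_range <- function(date_range) {
  if (!is.character(date_range) || length(date_range) != 1 || is.na(date_range)) {
    return(FALSE)
  }
  # Exactly four digits, the literal " TO ", then four more digits.
  if (!grepl("^[0-9]{4} TO [0-9]{4}$", date_range)) {
    return(FALSE)
  }
  years <- as.integer(strsplit(date_range, " TO ", fixed = TRUE)[[1]])
  years[1] <= years[2]
}

is_valid_date_range("1700 TO 1899")  # the default value shown in the usage
is_valid_date_range("1700-1899")     # wrong separator
```

A check like this could run before a long download is started, so that a malformed range is caught locally rather than after the search request has been sent.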