View source: R/hathi-ef-tools.R
get_hathi_counts | R Documentation |
Given a single Hathi Trust ID, this function returns a
tibble with the item's per-page word counts and part-of-speech
information, and caches the results in the directory given by
getOption("hathiTools.ef.dir") (by default "./hathi-ef"). If the file has not
already been cached, the function first attempts to download it directly from
the Hathi Trust server. This function uses code authored by Ben Schmidt, from
his Hathidy package (https://github.com/HumanitiesDataAnalysis/hathidy).
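The cache-first behaviour described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's actual implementation: `cache_path()`, `load_or_fetch()`, the `fetch` argument, and the file-naming scheme are all hypothetical, and a plain CSV file stands in for whatever cache format the package really uses.

```r
# Hypothetical helper: build a cache file path for an ID.
# The naming scheme is an assumption, not hathiTools' own.
cache_path <- function(htid, dir) {
  safe_id <- gsub("[^A-Za-z0-9._-]", "_", htid)  # replace path-unsafe characters
  file.path(dir, paste0(safe_id, ".csv"))
}

# Hypothetical helper: if no cache file exists, call `fetch(htid)` (a stand-in
# for the download step) and write its result to the cache; then read the
# cache file and return its contents.
load_or_fetch <- function(htid, dir, fetch) {
  path <- cache_path(htid, dir)
  if (!file.exists(path)) {
    dir.create(dir, recursive = TRUE, showWarnings = FALSE)
    utils::write.csv(fetch(htid), path, row.names = FALSE)
  }
  utils::read.csv(path, stringsAsFactors = FALSE)
}
```

In this sketch a second call with the same ID and directory reads the cached file and never invokes `fetch`, which is the property the description attributes to the real function.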
get_hathi_counts(
  htid,
  dir = getOption("hathiTools.ef.dir"),
  cache_format = getOption("hathiTools.cacheformat")
)
htid |
The Hathi Trust ID of the item whose extracted features files are to be loaded into memory. If the files have not been downloaded, the function will try to download them first. |
dir |
The directory where the downloaded extracted features files are to
be found. Defaults to getOption("hathiTools.ef.dir"). |
cache_format |
File format of the cache for Extracted Features files.
Defaults to getOption("hathiTools.cacheformat"). |
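Because the defaults for dir and cache_format are read from R options, they can be set once per session with base R's `options()`. A short sketch follows; the option name is taken from the usage above, while the directory chosen here is only an illustrative assumption. The same pattern would apply to "hathiTools.cacheformat".

```r
# Set the cache directory option, keeping the previous value for restoration.
old <- options(hathiTools.ef.dir = file.path(tempdir(), "hathi-ef"))

# Any argument whose default is getOption("hathiTools.ef.dir") now sees this value.
ef_dir <- getOption("hathiTools.ef.dir")

# Restore the previous setting (the option is removed again if it was unset).
options(old)
```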
A tibble with the extracted features.
Ben Schmidt
# Download the 1863 version of "Democracy in America" by Tocqueville and get
# its extracted features
tmp <- tempdir()
get_hathi_counts("aeu.ark:/13960/t3qv43c3w", dir = tmp)
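Once the tibble is loaded, the per-page counts can be summarised with ordinary data-frame tools. The sketch below uses a small hand-made data frame in place of real output so that it runs without a download. The column names `page`, `token`, `POS`, and `count` are assumptions about the returned tibble and should be checked against `names()` of the actual result.

```r
# Mock stand-in for the function's output; the values and column names are
# invented for illustration only.
counts <- data.frame(
  page  = c(1L, 1L, 2L, 2L, 2L),
  token = c("democracy", "in", "america", "the", "people"),
  POS   = c("NN", "IN", "NNP", "DT", "NNS"),
  count = c(3L, 5L, 2L, 7L, 1L)
)

# Total token count per page.
page_totals <- aggregate(count ~ page, data = counts, FUN = sum)
```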