clear_cache: Removes cached files for a set of Hathi Trust ids

View source: R/cache_tools.R

Description

Removes cached files for a set of Hathi Trust ids

Usage

clear_cache(
  htids,
  dir = getOption("hathiTools.ef.dir"),
  cache_type = c("ef", "meta", "pagemeta"),
  cache_format = c("csv.gz", "rds", "feather", "text2vec.csv", "parquet"),
  keep_json = TRUE
)

Arguments

htids

A character vector of Hathi Trust ids, a workset created with workset_builder, or a data frame with a column named "htid" containing the Hathi Trust ids whose cached files should be removed. Only files for these htids that are found in dir are affected.
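As a brief illustration of the accepted input forms, the sketch below builds the same ids as a character vector and as a data frame. It assumes hathiTools is installed, and the guarded call only runs in that case; the variable names are illustrative.

```r
# The same ids supplied as a character vector and as a data frame
htids <- c("mdp.39015008706338", "mdp.39015058109706")
htids_df <- data.frame(htid = htids, stringsAsFactors = FALSE)  # column must be named "htid"

# Assumption: hathiTools is installed; either form can be passed as `htids`
if (requireNamespace("hathiTools", quietly = TRUE)) {
  deleted <- hathiTools::clear_cache(htids_df, dir = tempdir())
}
```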

dir

The directory where the downloaded Extracted Features files are to be found. Defaults to getOption("hathiTools.ef.dir"), which is set to "hathi-ef" when the package is loaded.

cache_type

Type of information to remove. The default is c("ef", "meta", "pagemeta"), which refers to the extracted features, the volume metadata, and the page metadata in dir. Supplying a subset removes only the listed types (e.g., cache_type = "ef" removes only the EF files, not their associated metadata or page metadata).

cache_format

The format of the cached EF files to remove. Defaults to c("csv.gz", "rds", "feather", "text2vec.csv", "parquet"), i.e., all formats.
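The two arguments above can be combined to remove part of a cache. The following is a minimal sketch using only the arguments shown in Usage. It assumes hathiTools is installed and that files for these htids were previously cached in dir; the result names deleted_rds and deleted_meta are illustrative.

```r
htids <- c("mdp.39015008706338", "mdp.39015058109706")
dir <- tempdir()

# Assumption: hathiTools is installed and files for these htids are cached in `dir`
if (requireNamespace("hathiTools", quietly = TRUE)) {
  # Remove only cached EF files stored as rds; other formats, the metadata
  # caches, and the JSON files (keep_json defaults to TRUE) are left alone
  deleted_rds <- hathiTools::clear_cache(
    htids,
    dir = dir,
    cache_type = "ef",
    cache_format = "rds"
  )

  # Remove the volume and page metadata caches (default cache_format),
  # leaving any cached EF files in place
  deleted_meta <- hathiTools::clear_cache(
    htids,
    dir = dir,
    cache_type = c("meta", "pagemeta")
  )
}
```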

keep_json

Whether to keep any downloaded JSON files. Default is TRUE; if FALSE, all JSON Extracted Features files associated with the set of htids are also deleted.

Value

(Invisibly) a character vector with the deleted paths.
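Because the value is returned invisibly, nothing is printed unless the result is assigned or explicitly printed. The base R sketch below illustrates that behaviour with a hypothetical stand-in function, remove_and_report, which is not part of hathiTools and is not the package's implementation.

```r
# Hypothetical stand-in: deletes files and returns the removed paths invisibly
remove_and_report <- function(paths) {
  removed <- paths[file.remove(paths)]  # keep only paths that were deleted
  invisible(removed)
}

tmp <- tempfile(fileext = ".csv.gz")
file.create(tmp)

deleted <- remove_and_report(tmp)  # nothing is printed at this point
print(deleted)                     # printing must be requested explicitly
```

Wrapping the call in parentheses, as in (remove_and_report(tmp)), is another way to make an invisible result print.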

Note

Warning! This function does not double-check that you want to delete your cache. It will go ahead and do it.
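Since there is no confirmation step, it may help to look at what is in the directory before calling clear_cache. The sketch below is a base R preview under a stated assumption: that file names in dir begin with the htid. The helper preview_cache_files and the demo file names are hypothetical and not part of hathiTools, and the files it lists are not guaranteed to match what clear_cache would delete.

```r
# Hypothetical helper (not part of hathiTools): list files under `dir` whose
# names start with one of the given htids, for review before deletion.
preview_cache_files <- function(htids, dir) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  matches <- vapply(
    basename(files),
    function(f) any(startsWith(f, htids)),
    logical(1),
    USE.NAMES = FALSE
  )
  files[matches]
}

# Demo with placeholder files in a temporary directory
demo_dir <- file.path(tempdir(), "hathi-ef-preview-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "mdp.39015008706338.csv.gz"))
file.create(file.path(demo_dir, "unrelated.txt"))

to_review <- preview_cache_files("mdp.39015008706338", demo_dir)
basename(to_review)
```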

Examples


htids <- c("mdp.39015008706338", "mdp.39015058109706")
dir <- tempdir()

cache_htids(htids, dir = dir, cache_type = "ef", attempt_rsync = TRUE)

# Clears the cached files (here only "csv.gz" files exist) but keeps the JSON files

deleted <- clear_cache(htids, dir = dir)
deleted

# Clears also JSON files

deleted <- clear_cache(htids, dir = dir, keep_json = FALSE)
deleted



xmarquez/hathiTools documentation built on June 2, 2025, 5:12 a.m.