Cache: An R6 class for managing a persistent file-based cache

Cache R Documentation

An R6 class for managing a persistent file-based cache

Description

Cache provides an R6 API for managing an on-disk key-value store for R objects. The objects are serialized to a single folder as .rds files, and the key of each object is the name of its file. Cache can automatically remove old files when the cache folder exceeds a set number of files or total size, or when individual files exceed a certain age.
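The storage model described above can be illustrated in a few lines of base R. This is only a sketch of the idea, not rotor's implementation: the helper names sketch_push() and sketch_read() and the directory name are hypothetical, the key is used as the file name without any extension, and no pruning is performed.

```r
# Hypothetical helpers illustrating "key == file name"; not part of rotor.
cache_dir <- file.path(tempdir(), "cache-sketch")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

sketch_push <- function(x, key, dir = cache_dir) {
  # The object is serialized with saveRDS(); the key is used as the file name
  saveRDS(x, file = file.path(dir, key))
  invisible(key)
}

sketch_read <- function(key, dir = cache_dir) {
  # Reading by key is just reading the file of that name
  readRDS(file.path(dir, key))
}

sketch_push(head(iris), key = "iris-head")
print(list.files(cache_dir))          # the key appears as a file name
restored <- sketch_read("iris-head")  # round-trips to the original object

unlink(cache_dir, recursive = TRUE)
```

Because the key doubles as a file name, it has to be a valid file name on the target system, which is why hashfun must return such a string.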

Details

This class is part of the R6 API of rotor, which is intended for developers who want to extend this package. For normal usage, the simpler functional API is recommended (see rotate()).

Super class

rotor::DirectoryQueue -> Cache

Active bindings

dir

a character scalar. path of the directory in which to store the cache files

n

integer scalar: number of files in the cache

compression

see the compress argument of base::saveRDS(). Note: this differs from the $compress argument of rotate().

max_files

integer scalar: maximum number of files to keep in the cache

max_size

scalar integer, character or Inf. Delete cached files (starting with the oldest) until the total size of the cache is below max_size. Integers are interpreted as bytes. You can pass character vectors that contain a file size suffix like 1k (kilobytes), 3M (megabytes), 4G (gigabytes), 5T (terabytes). Instead of these short forms you can also be explicit and use the IEC suffixes KiB, MiB, GiB, TiB. In both cases 1 kilobyte is 1024 bytes, 1 megabyte is 1024 kilobytes, etc.
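To make the suffix arithmetic concrete, here is a minimal base R sketch of how such size strings could be turned into bytes. The helper size_to_bytes() is hypothetical and is not rotor's parser; it only covers the suffixes listed above and treats every unit step as a factor of 1024.

```r
# Hypothetical helper, not rotor's parser: converts "1k", "3M", "4GiB", ...
# to bytes, treating every unit step as a factor of 1024.
size_to_bytes <- function(x) {
  if (is.numeric(x)) return(x)  # numbers (including Inf) are already bytes
  m <- regmatches(x, regexec("^([0-9.]+) *([kKMGT])(iB)?$", x))[[1]]
  if (length(m) == 0) stop("cannot parse size: ", x)
  # m[[2]] is the number, m[[3]] the unit letter
  exponent <- match(toupper(m[[3]]), c("K", "M", "G", "T"))
  as.numeric(m[[2]]) * 1024^exponent
}

size_to_bytes("1k")    # 1024
size_to_bytes("3M")    # 3145728
size_to_bytes("4GiB")  # 4294967296
size_to_bytes(Inf)     # Inf
```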

max_age

Remove cached files based on their age. One of:

  • a Date scalar: remove all cached files from before this date

  • a character scalar representing a Date in ISO format (e.g. "2019-12-31")

  • a character scalar representing an Interval in the form "<number> <interval>" (see rotate())
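The first two forms describe the same kind of cutoff. The base R sketch below shows one way files older than a Date cutoff could be identified from their modification times. It is an illustration under stated assumptions, not rotor's code: the directory and file names are made up, and the interval form is left out because its semantics are defined by rotate().

```r
# Sketch only: which files are older than a Date cutoff? Not rotor's code.
age_dir <- file.path(tempdir(), "max-age-sketch")
dir.create(age_dir, showWarnings = FALSE, recursive = TRUE)

old_file <- file.path(age_dir, "old")
new_file <- file.path(age_dir, "new")
saveRDS(1, old_file)
saveRDS(2, new_file)

# Give the files known modification times (noon UTC avoids time zone edge cases)
Sys.setFileTime(old_file, as.POSIXct("2019-06-01 12:00:00", tz = "UTC"))
Sys.setFileTime(new_file, as.POSIXct("2020-06-01 12:00:00", tz = "UTC"))

# A Date scalar and an ISO date string resolve to the same cutoff
cutoff <- as.Date("2019-12-31")

files    <- list.files(age_dir, full.names = TRUE)
file_day <- as.Date(file.mtime(files), tz = "UTC")
expired  <- basename(files[file_day < cutoff])
print(expired)  # "old"

unlink(age_dir, recursive = TRUE)
```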

hashfun

NULL or a function to generate a unique hash from the object to be cached (see example). The hash must be a text string that is a valid filename on the target system. If $hashfun is NULL, a storage key must be supplied manually in cache$push(). If a new object is added with the same key as an existing object, the existing object will be overwritten without warning. All cached files

Methods

Method new()

Usage
Cache$new(
  dir = dirname(file),
  max_files = Inf,
  max_size = Inf,
  max_age = Inf,
  compression = TRUE,
  hashfun = digest::digest,
  create_dir = TRUE
)
Arguments
create_dir

logical scalar. If TRUE dir is created if it does not exist.

Examples
td <- file.path(tempdir(), "cache-test")

# When using a real hash function as hashfun, identical objects will only
# be added to the cache once
cache_hash <- Cache$new(td, hashfun = digest::digest)
cache_hash$push(iris)
cache_hash$push(iris)
cache_hash$files
cache_hash$purge()

# To override this behaviour use a generator for unique ids, such as uuid
if (requireNamespace("uuid")){
  cache_uid <- Cache$new(td, hashfun = function(x) uuid::UUIDgenerate())
  cache_uid$push(iris)
  cache_uid$push(iris)
  cache_uid$files
  cache_uid$purge()
}

unlink(td, recursive = TRUE)

Method push()

push a new object to the cache

Usage
Cache$push(x, key = self$hashfun(x))
Arguments
x

any R object

key

a character scalar. Key under which to store the cached object. Must be a valid filename. Defaults to being generated by $hashfun() but may also be supplied manually.

Returns

a character scalar: the key of the newly added object
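As a usage sketch, the following shows push() with a manually supplied key, which is required when hashfun is NULL. It assumes the rotor package is installed and is skipped otherwise; the directory name and the key "iris-v1" are arbitrary choices for illustration.

```r
# Assumes the rotor package is installed; the block is skipped otherwise
if (requireNamespace("rotor", quietly = TRUE)) {
  td <- file.path(tempdir(), "cache-key-sketch")

  # No hash function: every push() must name its key explicitly
  cache <- rotor::Cache$new(td, hashfun = NULL)
  key <- cache$push(iris, key = "iris-v1")
  print(key)

  # Pushing under the same key replaces the stored object without warning
  cache$push(cars, key = "iris-v1")
  print(cache$n)

  cache$purge()
  unlink(td, recursive = TRUE)
}
```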


Method read()

read a cached file

Usage
Cache$read(key)
Arguments
key

character scalar. key of the cached file to read.


Method remove()

remove a single file from the cache

Usage
Cache$remove(key)
Arguments
key

character scalar. key of the cached file to remove


Method pop()

Read and remove a single file from the cache

Usage
Cache$pop(key)
Arguments
key

character scalar. key of the cached file to read/remove
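The difference between read() and pop() can be seen by checking the file count after each call. This sketch assumes the rotor package is installed (it is skipped otherwise) and uses an arbitrary directory name and key.

```r
# Assumes the rotor package is installed; the block is skipped otherwise
if (requireNamespace("rotor", quietly = TRUE)) {
  td <- file.path(tempdir(), "cache-pop-sketch")
  cache <- rotor::Cache$new(td, hashfun = NULL)

  cache$push(letters, key = "abc")

  same   <- cache$read("abc")  # read() leaves the file in the cache
  n_read <- cache$n
  gone   <- cache$pop("abc")   # pop() reads the file and then removes it
  n_pop  <- cache$n

  print(c(after_read = n_read, after_pop = n_pop))

  cache$purge()
  unlink(td, recursive = TRUE)
}
```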


Method prune()

Prune the cache

Delete cached objects that match certain criteria. max_files and max_size delete the oldest cached objects first; however, this depends on the accuracy of the file modification timestamps on your system. For example, ext3 only supports one-second accuracy, and some Windows versions only support timestamps at a resolution of two seconds.

If two files have the same timestamp, they are deleted in the lexical sort order of their key. This means that by using a function that generates lexically sortable keys as hashfun (such as ulid::generate()) you can enforce the correct deletion order. There is no such workaround if you use a real hash function.
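The ordering rule described above (oldest first, ties broken by key) can be sketched in base R on a small table of made-up keys and timestamps. This is a hypothetical illustration of the rule, not rotor's implementation.

```r
# Hypothetical illustration of the ordering rule; not rotor's implementation.
files <- data.frame(
  key   = c("b", "a", "c", "d"),
  mtime = as.POSIXct(
    c("2020-01-01 00:00:01", "2020-01-01 00:00:01",  # "b" and "a" tie
      "2020-01-01 00:00:02", "2020-01-01 00:00:03"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Oldest first; ties on the timestamp fall back to the sort order of the key
deletion_order <- files$key[order(files$mtime, files$key)]
print(deletion_order)  # "a" "b" "c" "d"

# With max_files = 2, the two files at the front of that order would go
max_files <- 2
to_delete <- head(deletion_order, length(deletion_order) - max_files)
print(to_delete)  # "a" "b"
```

With a content hash as key, the tie between "a" and "b" would be resolved in an order unrelated to insertion time, which is the limitation the paragraph above points out.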

Usage
Cache$prune(
  max_files = self$max_files,
  max_size = self$max_size,
  max_age = self$max_age,
  now = Sys.time()
)
Arguments
max_files, max_size, max_age

see section Active Bindings.

now

a POSIXct datetime scalar. The current time (for max_age)


Method purge()

purge the cache (remove all cached files)

Usage
Cache$purge()

Method destroy()

purge the cache (remove all cached files)

Usage
Cache$destroy()

Method print()

Usage
Cache$print()

Method set_max_files()

Usage
Cache$set_max_files(x)

Method set_max_age()

Usage
Cache$set_max_age(x)

Method set_max_size()

Usage
Cache$set_max_size(x)

Method set_compression()

Usage
Cache$set_compression(x)

Method set_hashfun()

Usage
Cache$set_hashfun(x)

See Also

Other R6 Classes: BackupQueueDateTime, BackupQueueDate, BackupQueueIndex, BackupQueue, DirectoryQueue

s-fleck/rtdr documentation built on Oct. 18, 2022, 12:26 a.m.