cache_rds | R Documentation |
Save the value of an expression to a cache file (in the RDS format). The next time the function is called, the value is loaded from the file if it exists.
cache_rds(
  expr = {
  },
  rerun = FALSE,
  file = "cache.rds",
  dir = "cache/",
  hash = NULL,
  clean = getOption("xfun.cache_rds.clean", TRUE),
  ...
)
expr |
An R expression. |
rerun |
Whether to delete the RDS file, rerun the expression, and save the result again (i.e., invalidate the cache if it exists). |
file |
The base (see Details) cache filename under the directory
specified by the |
dir |
The path of the RDS file is partially determined by |
hash |
A |
clean |
Whether to clean up the old cache files automatically when
|
... |
Other arguments to be passed to |
Note that the file argument does not provide the full cache filename. The
actual name of the cache file is of the form ‘BASENAME_HASH.rds’, where
‘BASENAME’ is the base name provided via the file argument (e.g., if
file = 'foo.rds', BASENAME would be ‘foo’), and ‘HASH’ is the MD5 hash (also
called the ‘checksum’) calculated from the R code provided to the expr
argument and the value of the hash argument. This means that when the code or
the hash argument changes, the ‘HASH’ string may also change, and the old
cache will be invalidated (if it exists). If you want to find the cache file,
look for ‘.rds’ files that contain 32 hexadecimal digits (consisting of 0-9
and a-f) at the end of the filename.
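As a rough illustration of the naming scheme described above, the following sketch caches a trivial expression in a temporary directory and checks the resulting filename against the ‘BASENAME_HASH.rds’ pattern. It assumes the xfun package is installed; the directory name, the base name foo.rds, and the regular expression are choices made for this example only.

```r
# A throwaway cache directory, given a trailing slash like the default "cache/"
cache_root = tempfile("cache-demo-")
cache_dir = paste0(cache_root, "/")

res = xfun::cache_rds({ 1 + 1 }, file = "foo.rds", dir = cache_dir)

files = list.files(cache_dir)
# Example pattern: base name, underscore, 32 hexadecimal digits, then .rds
is_cache = grepl("^foo_[0-9a-f]{32}[.]rds$", files)
print(files)
print(is_cache)

unlink(cache_root, recursive = TRUE)  # remove the demo directory
```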
The possible ways to invalidate the cache are: 1) change the code in the expr
argument; 2) delete the cache file manually or automatically through the
argument rerun = TRUE; and 3) change the value of the hash argument. The
first two ways should be obvious. The third way makes it possible to
automatically invalidate the cache based on changes in certain R objects. For
example, when you run cache_rds({ x + y }), you may want to invalidate the
cache to rerun { x + y } when the value of x or y has been changed, and you
can tell cache_rds() to do so by cache_rds({ x + y }, hash = list(x, y)). The
value of the argument hash is expected to be a list, but it can also take a
special value, "auto", which means cache_rds(expr) will try to automatically
figure out the global variables in expr, return a list of their values, and
use this list as the actual value of hash. This behavior is most likely to be
what you really want: if the code in expr uses an external global variable,
you may want to invalidate the cache if the value of the global variable has
changed. Here a “global variable” means a variable not created locally in
expr, e.g., for cache_rds({ x <- 1; x + y }), x is a local variable, and y is
(most likely to be) a global variable, so changes in y should invalidate the
cache. However, you know your own code the best. If you want to be completely
sure when to invalidate the cache, you can always provide a list of objects
explicitly rather than relying on hash = "auto".
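The effect of an explicit hash list can be sketched as follows. This is a minimal example, assuming the xfun package is installed; the counter environment, the add_xy() wrapper, and the file name sum.rds are hypothetical helpers introduced only to make it visible when the expression is actually evaluated.

```r
cache_root = tempfile("cache-demo-")
cache_dir = paste0(cache_root, "/")

# Hypothetical helper: counts how many times the cached expression really runs
counter = new.env()
counter$n = 0

add_xy = function() {
  xfun::cache_rds({
    counter$n = counter$n + 1
    x + y
  }, file = "sum.rds", dir = cache_dir, hash = list(x, y))
}

x = 1; y = 2
r1 = add_xy()  # no cache yet: the expression is evaluated
r2 = add_xy()  # same code and same hash list: the value is read from the cache
y = 10
r3 = add_xy()  # y changed, so the hash list changed: evaluated again

print(c(r1, r2, r3))
print(counter$n)

unlink(cache_root, recursive = TRUE)  # remove the demo directory
```

Passing the list explicitly keeps the invalidation rule visible at the call site, which may be preferable when it matters exactly which objects the cache depends on.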
By default (the argument clean = TRUE), old cache files will be automatically
cleaned up. Sometimes you may want to use clean = FALSE (set the R global
option options(xfun.cache_rds.clean = FALSE) if you want FALSE to be the
default). For example, you may not have decided which version of code to use,
and you can keep the cache of both versions with clean = FALSE, so when you
switch between the two versions of code, it will still be fast to run the
code.
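The two-versions scenario above might look like the sketch below. It assumes the xfun package is installed; version_a(), version_b(), the counter environment, and the file name demo.rds are hypothetical names used only for illustration.

```r
cache_root = tempfile("cache-demo-")
cache_dir = paste0(cache_root, "/")

# Hypothetical helper: counts how many times either expression really runs
counter = new.env()
counter$n = 0

# Two candidate versions of the code that share the same base cache filename
version_a = function() xfun::cache_rds({
  counter$n = counter$n + 1
  sum(1:10)
}, file = "demo.rds", dir = cache_dir, clean = FALSE)

version_b = function() xfun::cache_rds({
  counter$n = counter$n + 1
  prod(1:10)
}, file = "demo.rds", dir = cache_dir, clean = FALSE)

a1 = version_a(); b1 = version_b()  # each version runs once, writing its own file
a2 = version_a(); b2 = version_b()  # switching back and forth reuses both caches

n_files = length(list.files(cache_dir))
print(n_files)
print(counter$n)

unlink(cache_root, recursive = TRUE)  # remove the demo directory
```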
If the cache file does not exist, run the expression, save the result to the file, and return it; otherwise, read the cache file and return the value.
Changes in the code in the expr argument do not necessarily invalidate the
cache if the changed code is parsed to the same expression as the previous
version of the code. For example, if you have run
cache_rds({Sys.sleep(5);1+1}) before, running
cache_rds({ Sys.sleep( 5 ) ; 1 + 1 }) will use the cache, because the two
expressions are essentially the same (they only differ in white spaces).
Usually you can add/delete white spaces or comments to your code in expr
without invalidating the cache. See the package vignette
vignette('xfun', package = 'xfun') for more examples.
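A small sketch of this behavior, assuming the xfun package is installed: compact() and spaced() are hypothetical wrappers around two differently formatted versions of the same expression, and the counter environment only records how often the expression is evaluated.

```r
cache_root = tempfile("cache-demo-")
cache_dir = paste0(cache_root, "/")

# Hypothetical helper: counts how many times the expression really runs
counter = new.env()
counter$n = 0

# The same expression written without spaces ...
compact = function() xfun::cache_rds({counter$n=counter$n+1;1+1},
  file = "ws.rds", dir = cache_dir)

# ... and with extra white space, line breaks, and a comment
spaced = function() xfun::cache_rds({
  # a comment that was not in the first version
  counter$n  =  counter$n + 1
  1 + 1
}, file = "ws.rds", dir = cache_dir)

a = compact()  # evaluates the expression and writes the cache
b = spaced()   # parsed to the same expression, so the cache is reused

print(c(a, b))
print(counter$n)

unlink(cache_root, recursive = TRUE)  # remove the demo directory
```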
When this function is called in a code chunk of a knitr document, you may not want to provide the filename or directory of the cache file, because they have reasonable defaults.
Side-effects (such as plots or printed output) will not be cached. The cache
only stores the last value of the expression in expr.
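The following sketch shows what this means in practice, assuming the xfun package is installed. The noisy() wrapper and the file name noisy.rds are hypothetical; capture.output() is used only to record whether anything was printed.

```r
cache_root = tempfile("cache-demo-")
cache_dir = paste0(cache_root, "/")

noisy = function() xfun::cache_rds({
  cat("computing...\n")  # a side effect: printed output
  42                     # the last value, which is what gets cached
}, file = "noisy.rds", dir = cache_dir)

out1 = capture.output(r1 <- noisy())  # first call: runs the expression and prints
out2 = capture.output(r2 <- noisy())  # second call: value comes from the cache

print(out1)
print(out2)
print(c(r1, r2))

unlink(cache_root, recursive = TRUE)  # remove the demo directory
```

If the printed output or a plot is needed on every run, it has to be produced outside of the cached expression.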
See also cache_exec(), which is more flexible (e.g., it supports in-memory
caching and different read/write methods for cache files).
f = tempfile()  # the cache file
compute = function(...) {
  res = xfun::cache_rds({
    Sys.sleep(1)
    1:10
  }, file = f, dir = "", ...)
  res
}
compute()  # takes one second
compute()  # returns 1:10 immediately
compute()  # fast again
compute(rerun = TRUE)  # one second to rerun
compute()
unlink(paste0(f, "_*.rds"))