.robustDigest | R Documentation |
Not all aspects of R objects are captured by current hashing tools in R
(e.g., digest::digest, knitr caching, archivist::cache).
This is mostly because many objects have "transient"
(e.g., functions have environments), or "disk-backed" features.
Since the goal of reproducibility is to have tools that are not session specific,
this function attempts to strip all session-specific information so that the digest
works between sessions and operating systems.
Although it is tested under many conditions and object types, there are bound to be others
that do not work correctly.
.robustDigest(
object,
.objects = NULL,
length = getOption("reproducible.length", Inf),
algo = "xxhash64",
quick = getOption("reproducible.quick", FALSE),
classOptions = list(),
...
)
## S4 method for signature 'ANY'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'function'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'expression'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'language'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'character'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'Path'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'environment'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'list'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'data.frame'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'numeric'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'matrix'
.robustDigest(object, .objects, length, algo, quick, classOptions)
## S4 method for signature 'integer'
.robustDigest(object, .objects, length, algo, quick, classOptions)
object |
an object to digest. |
.objects |
Character vector of objects to be digested. This is only applicable if there is a list, environment (or similar) with named objects within it. Only this/these objects will be considered for caching, i.e., only use a subset of the list, environment or similar objects. In the case of nested list-type objects, this will only be applied outermost first. |
length |
Numeric. If the element passed to Cache is a |
algo |
The algorithms to be used; currently available choices are
|
quick |
Logical or character. If |
classOptions |
Optional list. This will pass into |
... |
Arguments passed to |
objects |
Optional character vector indicating which objects are to
be considered while making digestible. This argument is not used
in the default cases; the only known method that uses this
argument is the |
A hash (i.e., digest) of the object passed in.
Raster* objects have the potential for disk-backed storage and thus require more work.
Also, because Raster* objects can have a built-in representation with their data content
located on disk, this format will be maintained if the raster is already file-backed,
i.e., to create .tif or .grd backed rasters, use writeRaster first, then Cache.
The ‘.tif’ or ‘.grd’ will be copied to the ‘raster/’ subdirectory of the cachePath.
Their RAM representation (as an R object) will still be in the usual ‘cacheOutputs/’
(or formerly ‘gallery/’) directory.
inMemory raster objects will remain as binary .RData files.
Functions (which are contained within environments) are
converted to a text representation via a call to format(FUN).
Objects contained within a list or environment are recursively hashed
using digest::digest(), while removing all references to environments.
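As a rough illustration of why a text representation helps, the base-R sketch below builds two closures with identical code but distinct enclosing environments, then compares their format() output. It is not the package's internal code, and make_adder is a hypothetical helper introduced only for this example.

```r
# Hypothetical helper: each call returns a new closure with its own environment
make_adder <- function() {
  function(x) x + 1
}

f1 <- make_adder()
f2 <- make_adder()

# The enclosing environments are distinct objects
same_env <- identical(environment(f1), environment(f2))

# The text representation does not include the environment
same_text <- identical(format(f1), format(f2))

print(same_env)   # FALSE
print(same_text)  # TRUE
```

Hashing the text rather than the closure itself is what lets the result stay stable when the same function is defined again in a new session.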
Character strings are first assessed with dir.exists and file.exists
to check for paths. If they are found to be paths, then the path is hashed with
only its filename via basename(filename). If it is actually a path, we suggest
using asPath(thePath).
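To make the path rule concrete, here is a minimal base-R sketch of the idea; it does not call .robustDigest, and the directory and file names are made up for illustration. Two files with the same name in different directories have different full path strings but the same basename().

```r
# Two hypothetical directories under the session temp directory
dir_a <- file.path(tempdir(), "run_a")
dir_b <- file.path(tempdir(), "run_b")
dir.create(dir_a, showWarnings = FALSE)
dir.create(dir_b, showWarnings = FALSE)

path_a <- file.path(dir_a, "input.csv")
path_b <- file.path(dir_b, "input.csv")
writeLines("x,y", path_a)
writeLines("x,y", path_b)

# Both strings refer to files that exist on disk
both_exist <- file.exists(path_a) && file.exists(path_b)

# The full strings differ, the filenames alone do not
same_full <- identical(path_a, path_b)
same_base <- identical(basename(path_a), basename(path_b))

print(c(both_exist = both_exist, same_full = same_full, same_base = same_base))
```

A digest based on the filename alone would therefore not change when a project directory is moved, whereas a digest of the full character string would.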
Eliot McIntire
library(reproducible) # provides .robustDigest() and asPath()
a <- 2
tmpfile1 <- tempfile()
tmpfile2 <- tempfile()
tmpfile3 <- tempfile(fileext = ".grd")
tmpfile4 <- tempfile(fileext = ".grd")
save(a, file = tmpfile1)
save(a, file = tmpfile2)
# treats as character string, so 2 filenames are different
digest::digest(tmpfile1)
digest::digest(tmpfile2)
# tests to see whether character string is representing a file
.robustDigest(tmpfile1)
.robustDigest(tmpfile2) # same
# if you tell it that it is a path, then you can decide if you want it to be
# treated as a character string or as a file path
.robustDigest(asPath(tmpfile1), quick = TRUE)
.robustDigest(asPath(tmpfile2), quick = TRUE) # different because using file info
.robustDigest(asPath(tmpfile1), quick = FALSE)
.robustDigest(asPath(tmpfile2), quick = FALSE) # same because using file content
# SpatRasters have pointers
if (requireNamespace("terra", quietly = TRUE)) {
  r <- terra::rast(system.file("ex/elev.tif", package = "terra"))
  r3 <- terra::deepcopy(r)
  r1 <- terra::writeRaster(r, filename = tmpfile3)
  digest::digest(r)
  digest::digest(r3) # different but should be same
  .robustDigest(r1)
  .robustDigest(r3) # same... data & metadata are the same
  # note, this is not true for comparing memory and file-backed rasters
  .robustDigest(r)
  .robustDigest(r1) # different
}