README.md


Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Datanodes

An R package that caches the result of an expression on disk. It is intended for the incremental writing of analysis code in R scripts, R Markdown or knitr documents.

It is in essence similar to the caching functionality of knitr, but available from the console, i.e., without having to knit the document to benefit from the cache. See the cache section of the knitr options documentation for more information about this functionality in knitr.

Installation

if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github('jullybobble/datanodes@master')

Features

Planned Features

Most of the planned features above require the storage of some metadata.

Usage

library(datanodes)

model_cache <- tempfile()
model <- datanode(model_cache, { 
  # a potentially expensive operation
  # for this example, we choose a not so expensive one...
  lm(formula = mpg ~ wt, data = mtcars)
})

During the first execution of the code above, the expression passed as an argument to the datanode function is evaluated and its result is cached in the file model_cache. Further executions of the code read the value from the cache and assign it to the model variable without evaluating the expression lm(formula = mpg ~ wt, data = mtcars). This saves time whenever evaluating the expression takes longer than reading the cached value from the file.

Next, we define a dependency on the result cached above in the file model_cache by passing it to the depends_on parameter:
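To make the evaluate-once behaviour concrete, here is a minimal base R sketch of the general idea. It is not the datanodes implementation: the helper `cached_eval`, the counter `evaluations` and the function `fit_model` are hypothetical names introduced for this illustration, and storing the value with `saveRDS` is an assumption about the on-disk format.

```r
# Hypothetical helper, NOT the datanodes implementation:
# evaluate `expr` only when `path` does not exist yet, or when forced.
cached_eval <- function(path, expr, force = FALSE) {
  if (!force && file.exists(path)) {
    # Cache hit: `expr` is a lazy argument and is never evaluated here.
    return(readRDS(path))
  }
  value <- expr          # Cache miss: touching `expr` evaluates the expression.
  saveRDS(value, path)   # Assumed storage format for this sketch.
  value
}

cache_file <- tempfile(fileext = ".rds")
evaluations <- 0

# Counts how often the "expensive" expression really runs.
fit_model <- function() {
  evaluations <<- evaluations + 1
  lm(formula = mpg ~ wt, data = mtcars)
}

first  <- cached_eval(cache_file, fit_model())  # evaluates and writes the cache
second <- cached_eval(cache_file, fit_model())  # reads the cache only
```

The sketch relies on R's lazy evaluation of function arguments: on a cache hit the expression is never touched, so `fit_model()` runs only once across the two calls.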

response_cache <- tempfile()
response <- datanode(response_cache,
                     depends_on = model_cache, {
  # another potentially expensive operation
  # again, for the example this is not so expensive
  predict(model, data.frame(wt = 1:50))
})

The first execution of this code snippet will trigger the evaluation of the expression given as argument since the file response_cache does not exist.

Later in the development of our code, we decide to add an independent variable to our model. We therefore edit the formula describing the model in the first code example as shown below, and set the argument force to TRUE to force the evaluation of the expression. Without force, the cached value would be read from the file model_cache.

model <- datanode(model_cache, force = TRUE, { 
  # same formula as before with the additional hp independent variable
  lm(formula = mpg ~ wt + hp, data = mtcars)
})

After this update of the model cache, re-running the response code in the second block above triggers a re-evaluation of its expression, because response_cache depends on model_cache.
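One plausible way to decide that a dependent cache is out of date is to compare file modification times. The README does not say how datanodes performs this check, so the following base R sketch is only an assumption, and the helper `is_stale` is a hypothetical name introduced for this illustration.

```r
# Hypothetical staleness check, NOT the datanodes implementation.
# Assumption: a cache is stale when it is missing or older than a dependency.
is_stale <- function(path, depends_on = character()) {
  if (!file.exists(path)) return(TRUE)
  any(file.mtime(depends_on) > file.mtime(path))
}

upstream   <- tempfile()  # plays the role of model_cache
downstream <- tempfile()  # plays the role of response_cache

saveRDS("model v1", upstream)
saveRDS("response v1", downstream)

# Set explicit timestamps so the example does not depend on clock resolution.
Sys.setFileTime(upstream,   as.POSIXct("2020-01-01 00:00:00", tz = "UTC"))
Sys.setFileTime(downstream, as.POSIXct("2020-01-01 00:01:00", tz = "UTC"))
stale_before <- is_stale(downstream, depends_on = upstream)

# Simulate a forced update of the upstream cache at a later time.
saveRDS("model v2", upstream)
Sys.setFileTime(upstream, as.POSIXct("2020-01-01 00:02:00", tz = "UTC"))
stale_after <- is_stale(downstream, depends_on = upstream)
```

Under this assumption, `stale_before` is `FALSE` because the downstream file is newer than its dependency, and `stale_after` is `TRUE` once the upstream file has been rewritten.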



jullybobble/datanodes documentation built on May 20, 2019, 4:23 a.m.