drake_config | R Documentation |
Call this function inside the _drake.R script for r_make() and friends.
All non-deprecated function arguments are the same between make() and drake_config().
drake_config(
plan,
targets = NULL,
envir = parent.frame(),
verbose = 1L,
hook = NULL,
cache = drake::drake_cache(),
fetch_cache = NULL,
parallelism = "loop",
jobs = 1L,
jobs_preprocess = 1L,
packages = rev(.packages()),
lib_loc = NULL,
prework = character(0),
prepend = NULL,
command = NULL,
args = NULL,
recipe_command = NULL,
timeout = NULL,
cpu = Inf,
elapsed = Inf,
retries = 0,
force = FALSE,
log_progress = TRUE,
graph = NULL,
trigger = drake::trigger(),
skip_targets = FALSE,
skip_imports = FALSE,
skip_safety_checks = FALSE,
lazy_load = "eager",
session_info = NULL,
cache_log_file = NULL,
seed = NULL,
caching = c("main", "master", "worker"),
keep_going = FALSE,
session = NULL,
pruning_strategy = NULL,
makefile_path = NULL,
console_log_file = NULL,
ensure_workers = NULL,
garbage_collection = FALSE,
template = list(),
sleep = function(i) 0.01,
hasty_build = NULL,
memory_strategy = "speed",
spec = NULL,
layout = NULL,
lock_envir = NULL,
history = TRUE,
recover = FALSE,
recoverable = TRUE,
curl_handles = list(),
max_expand = NULL,
log_build_times = TRUE,
format = NULL,
lock_cache = TRUE,
log_make = NULL,
log_worker = FALSE
)
plan |
Workflow plan data frame.
A workflow plan data frame is a data frame
with a |
targets |
Character vector, names of targets to build. Dependencies are built too. You may supply static and/or whole dynamic targets, but no sub-targets. |
envir |
Environment to use. Defaults to the current
workspace, so you should not need to worry about this
most of the time. A deep copy of |
verbose |
Integer, control printing to the console/terminal.
|
hook |
Deprecated. |
cache |
drake cache as created by |
fetch_cache |
Deprecated. |
parallelism |
Character scalar, type of parallelism to use.
For detailed explanations, see
You could also supply your own scheduler function
if you want to experiment or aggressively optimize.
The function should take a single
|
jobs |
Maximum number of parallel workers for processing the targets.
You can experiment with |
jobs_preprocess |
Number of parallel jobs for processing the imports and doing other preprocessing tasks. |
packages |
Character vector of packages to load, in the order
they should be loaded. Defaults to |
lib_loc |
Character vector, optional.
Same as in |
prework |
Expression (language object), list of expressions,
or character vector.
Code to run right before targets build.
Called only once if |
prepend |
Deprecated. |
command |
Deprecated. |
args |
Deprecated. |
recipe_command |
Deprecated. |
timeout |
|
cpu |
Same as the |
elapsed |
Same as the |
retries |
Number of retries to execute if the target fails.
Assign target-level retries with an optional |
force |
Logical. If |
log_progress |
Logical, whether to log the progress
of individual targets as they are being built. Progress logging
creates extra files in the cache (usually the |
graph |
Deprecated. |
trigger |
Name of the trigger to apply to all targets.
Ignored if |
skip_targets |
Logical, whether to skip building the targets
in |
skip_imports |
Logical, whether to totally neglect to
process the imports and jump straight to the targets. This can be useful
if your imports are massive and you just want to test your project,
but it is bad practice for reproducible data analysis.
This argument is overridden if you supply your own |
skip_safety_checks |
Logical, whether to skip the safety checks on your workflow. Use at your own peril. |
lazy_load |
An old feature, currently being questioned.
For the current recommendations on memory management, see
If |
session_info |
Logical, whether to save the |
cache_log_file |
Name of the CSV cache log file to write.
If |
seed |
Integer, the root pseudo-random number generator
seed to use for your project.
In To ensure reproducibility across different R sessions,
On the first call to |
caching |
Character string, either
|
keep_going |
Logical, whether to still keep running |
session |
Deprecated. Has no effect now. |
pruning_strategy |
Deprecated. See |
makefile_path |
Deprecated. |
console_log_file |
Deprecated in favor of |
ensure_workers |
Deprecated. |
garbage_collection |
Logical, whether to call |
template |
A named list of values to fill in the |
sleep |
Optional function on a single numeric argument To conserve memory, For parallel processing, The To sleep for the same amount of time between checks,
you might supply something like |
hasty_build |
Deprecated. |
memory_strategy |
Character scalar, name of the
strategy
For even more direct
control over which targets |
spec |
Deprecated. |
layout |
Deprecated. |
lock_envir |
Deprecated in |
history |
Logical, whether to record the build history
of your targets. You can also supply a
|
recover |
Logical, whether to activate automated data recovery.
The default is
How it works: if
If both conditions are met,
Functions |
recoverable |
Logical, whether to make target values recoverable
with |
curl_handles |
A named list of curl handles. Each value is an
object from
|
max_expand |
Positive integer, optional.
|
log_build_times |
Logical, whether to record build_times for targets.
Mac users may notice a 20% speedup in |
format |
Character, an optional custom storage format for targets
without an explicit |
lock_cache |
Logical, whether to lock the cache before running |
log_make |
Optional character scalar of a file name or
connection object (such as |
log_worker |
Logical, same as the |
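As a rough illustration of the sleep argument described above, the sketch below defines two candidate functions of a single numeric argument. The names sleep_constant and sleep_backoff are hypothetical, and the back-off formula is only an assumption about what a reasonable schedule might look like; it is not taken from the drake documentation. The sketch assumes that i counts consecutive checks and that the return value is a number of seconds.

```r
# Constant interval: the same pause after every check.
# This mirrors the default shown in the usage, function(i) 0.01.
sleep_constant <- function(i) 0.01

# Hypothetical back-off: the pause doubles with each check,
# but never exceeds 2 seconds.
sleep_backoff <- function(i) min(0.01 * 2^(i - 1), 2)

# Inspect the first few intervals each function would produce.
constant_intervals <- vapply(1:5, sleep_constant, numeric(1))
backoff_intervals <- vapply(1:5, sleep_backoff, numeric(1))
print(constant_intervals)
print(backoff_intervals)
```

Either function could then be passed along as sleep = sleep_backoff in drake_config() or make(). Both are self-contained and do not rely on objects in the calling environment.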
In drake, make() has two stages:
1. Configure a workflow to your environment and plan.
2. Build targets.
The drake_config() function just does step (1), which is a common requirement for not only make(), but also utility functions like vis_drake_graph() and outdated(). That is why drake_config() is a requirement for the _drake.R script, which powers r_make(), r_outdated(), r_vis_drake_graph(), etc.
A configured drake workflow.
make(recover = TRUE, recoverable = TRUE) powers automated data recovery. The default of recover is FALSE because targets recovered from the distant past may have been generated with earlier versions of R and earlier package environments that no longer exist.
How it works: if recover is TRUE, drake tries to salvage old target values from the cache instead of running commands from the plan. A target is recoverable if
1. There is an old value somewhere in the cache that shares the command, dependencies, etc. of the target about to be built.
2. The old value was generated with make(recoverable = TRUE).
If both conditions are met, drake will
1. Assign the most recently-generated admissible data to the target, and
2. skip the target's command.
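The recovery conditions above can be tried out with a small sketch like the following. It assumes the drake package is installed. The names plan_a, plan_b, cache, and value are illustrative only, and the throwaway cache in a temporary directory is a choice made here to avoid touching a real project.

```r
# Sketch only: runs when drake is available and is skipped otherwise.
if (requireNamespace("drake", quietly = TRUE)) {
  library(drake)

  # Throwaway cache so the example leaves no .drake/ folder behind.
  cache <- new_cache(path = tempfile())

  plan_a <- drake_plan(x = 1 + 1)
  plan_b <- drake_plan(x = 2 + 2)

  # First build: x is stored with recoverable = TRUE (the default).
  make(plan_a, cache = cache)

  # The command changed, so x is rebuilt with the new command.
  make(plan_b, cache = cache)

  # Back to the original command. With recover = TRUE, drake may reuse
  # the old value of x from the cache instead of running the command.
  make(plan_a, cache = cache, recover = TRUE)

  value <- readd(x, cache = cache)
  print(value)
}
```

Whether the last make() recovers the old value or rebuilds it, the stored result corresponds to the command in plan_a. Recovery only changes how that result is obtained.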
make(), drake_plan(), vis_drake_graph()
## Not run:
isolate_example("quarantine side effects", {
  if (requireNamespace("knitr", quietly = TRUE)) {
    writeLines(
      c(
        "library(drake)",
        "load_mtcars_example()",
        "drake_config(my_plan, targets = c(\"small\", \"large\"))"
      ),
      "_drake.R" # default value of the `source` argument
    )
    cat(readLines("_drake.R"), sep = "\n")
    r_outdated()
    r_make()
    r_outdated()
  }
})
## End(Not run)