The pipeline config is stored in the config/pipeline.yaml file. The directory containing this file is read from the SCDRAKE_PIPELINE_CONFIG_DIR environment variable when {scdrake} is loaded or attached, and saved as the scdrake_pipeline_config_dir option.
This option is used as the default argument value in several {scdrake} functions.
Pipeline parameters have no impact on the analysis (except SEED, but {drake} should handle that reproducibly).
Parameters starting with DRAKE_ are generally passed to drake::make() or drake::drake_config().
DRAKE_TARGETS: null
Type: character vector or null
Array of target names to make. Setting to null will make all targets.
Example for the single-sample pipeline, stage 02_norm_clustering reports:
DRAKE_TARGETS: ["report_norm_clustering", "report_norm_clustering_simple"]
DRAKE_CACHE_DIR: ".drake"
Type: character scalar or null
The name of the directory in which to store drake's cache.
If null, the default directory ".drake" will be used.
DRAKE_KEEP_GOING: False
Type: logical scalar
If True, let the pipeline continue even if some targets fail.
DRAKE_VERBOSITY: 1
Type: integer scalar (0 | 1 | 2)
Verbosity of {drake}:
0: print nothing.
1: print target-by-target messages as make() progresses.
2: show a progress bar to track how many targets are done so far.
DRAKE_LOCK_ENVIR: True
Type: logical scalar
{drake} locks the R global environment to prevent unwanted modifications by targets.
However, in some cases it needs to be kept unlocked.
DRAKE_UNLOCK_CACHE: True
Type: logical scalar
Unlock the cache immediately instead of waiting for {drake} to discover that it is locked after the pipeline is run.
DRAKE_FORMAT: "rds"
Type: character scalar
The file format used to store intermediate results in DRAKE_CACHE_DIR.
See https://books.ropensci.org/drake/plans.html#special-data-formats-for-targets for more details.
By default, R's RDS format is used (see ?saveRDS), but we recommend using DRAKE_FORMAT: "qs" (see https://github.com/traversc/qs), which offers better performance but sometimes doesn't work correctly (drake throws untraceable errors).
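As an illustration, switching the cache format in config/pipeline.yaml might look like the fragment below. This is a sketch that assumes the {qs} package is installed; if drake starts throwing errors with this format, reverting to "rds" is the conservative fallback.

```yaml
# config/pipeline.yaml (fragment)
# Assumption: the qs package is installed in the R library used by the pipeline.
DRAKE_FORMAT: "qs"
# Fallback if drake throws errors with "qs":
# DRAKE_FORMAT: "rds"
```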
DRAKE_REBUILD: null
Type: character scalar ("all" | "current") or null
Instructs {drake} to rebuild targets even though they are considered finished.
"all": the pipeline is run from scratch (drake::trigger(condition = TRUE) is passed as the trigger argument to drake::make() or drake::drake_config()).
"current": drake::clean() is run for targets specified in DRAKE_TARGETS.
DRAKE_CACHING: "worker"
Type: character scalar ("worker" | "main")
How to collect data from parallel workers. See the caching parameter in drake::drake_config().
DRAKE_MEMORY_STRATEGY: "speed"
Type: character scalar ("speed" | "autoclean" | "preclean" | "lookahead" | "unload" | "none")
How to manage target objects in memory during runtime. See the memory_strategy parameter in drake::drake_config().
You can consider "autoclean", "preclean" or "lookahead" to conserve memory, at the expense of speed.
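For example, on a machine with limited RAM the default could be overridden as in the fragment below. Treat it as a sketch: which strategy suits a given dataset is an assumption to verify, and the exact semantics of each value are defined by the memory_strategy parameter of drake::drake_config().

```yaml
# config/pipeline.yaml (fragment)
# Trade some speed for lower memory usage.
DRAKE_MEMORY_STRATEGY: "autoclean"
```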
DRAKE_LOG_BUILD_TIMES: False
Type: logical scalar
Whether to record build times for targets. Mac users may notice a 20% speedup in drake::make() with DRAKE_LOG_BUILD_TIMES: False.
BLAS_N_THREADS: null
Type: positive integer scalar or null
The maximum number of threads for BLAS operations, passed to RhpcBLASctl::blas_set_num_threads().
Prevents the "BLAS : Program is Terminated. Because you tried to allocate too many memory regions" error when massive target parallelism is used. Set to null if you want to keep the BLAS defaults.
RSTUDIO_PANDOC: null
Type: character scalar or null
A path to the directory with pandoc's binary, which is required for rendering HTML reports.
You can ignore this if:
Scdrake is run in its Docker container.
scdrake is run from RStudio (it has pandoc bundled).
pandoc is available in the PATH environment variable. You can check this by calling system("pandoc -v").
In {rmarkdown}, the pandoc binary to use is then resolved by rmarkdown::find_pandoc().
SEED: 100
Type: integer scalar
The initial seed for the random number generator.
DRAKE_PARALLELISM: "loop"
Type: character scalar ("loop" | "future" | "clustermq")
Type of {drake} parallelism.
DRAKE_CLUSTERMQ_SCHEDULER: "multicore"
Type: character scalar
Which scheduler to use if DRAKE_PARALLELISM is "clustermq".
See https://mschubert.github.io/clustermq/articles/userguide.html#configuration for possible values.
DRAKE_N_JOBS: 4
Type: positive integer scalar
The number of parallel jobs for drake.
DRAKE_N_JOBS_PREPROCESS: 4
Type: positive integer scalar
The number of parallel jobs for processing the imports and doing other preprocessing tasks.
WITHIN_TARGET_PARALLELISM: False
Type: logical scalar
Allow or disable within-target parallelism through the BiocParallel package.
Only possible when DRAKE_PARALLELISM is "loop".
N_CORES: 4
Type: positive integer scalar
The number of cores for within-target parallelism.
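The parallelism parameters above can be combined in two mutually exclusive ways. The fragment below is a sketch only: the job and core counts are placeholder values, not recommendations, and setting BLAS_N_THREADS to 1 is an assumption about how to avoid the BLAS memory-region error under target-level parallelism.

```yaml
# config/pipeline.yaml (fragment)

# Option A: target-level parallelism, several targets are built at once.
DRAKE_PARALLELISM: "clustermq"
DRAKE_CLUSTERMQ_SCHEDULER: "multicore"
DRAKE_N_JOBS: 4
DRAKE_N_JOBS_PREPROCESS: 4
BLAS_N_THREADS: 1  # assumption: one BLAS thread per parallel job
WITHIN_TARGET_PARALLELISM: False

# Option B: targets are built one by one, each may use several cores.
# Within-target parallelism requires DRAKE_PARALLELISM: "loop".
# DRAKE_PARALLELISM: "loop"
# WITHIN_TARGET_PARALLELISM: True
# N_CORES: 4
```

Option B is left commented out so that the fragment stays a valid config with no duplicate keys; to use it, comment out Option A instead.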
An informative plan is bound to every other plan and contains targets with useful runtime information.
See the Targets section in vignette("config_main").