The pipeline config is stored in the config/pipeline.yaml file. The directory containing this file is read from the SCDRAKE_PIPELINE_CONFIG_DIR environment variable when {scdrake} is loaded or attached, and saved as the scdrake_pipeline_config_dir option. This option is used as the default argument value in several {scdrake} functions.

Pipeline parameters have no impact on the analysis itself (except SEED, but {drake} should handle that reproducibly). Parameters starting with DRAKE_ are generally passed to drake::make() or drake::drake_config().


General parameters

DRAKE_TARGETS: null

Type: character vector or null

Array of target names to make. Setting it to null makes all targets. Example for the reports of the single-sample pipeline, stage 02_norm_clustering:

DRAKE_TARGETS: ["report_norm_clustering", "report_norm_clustering_simple"]

DRAKE_CACHE_DIR: ".drake"

Type: character scalar or null

The name of the directory in which to store drake's cache. If null, the default directory ".drake" is used.


DRAKE_KEEP_GOING: False

Type: logical scalar

If True, let the pipeline continue even if some target fails.


DRAKE_VERBOSITY: 1

Type: integer scalar (1 | 2 | 3)

Verbosity of {drake}. See the verbose parameter in drake::drake_config() for the meaning of each level.


DRAKE_LOCK_ENVIR: True

Type: logical scalar

{drake} locks the R global environment to prevent unwanted modifications by targets. However, in some cases it needs to be kept unlocked.


DRAKE_UNLOCK_CACHE: True

Type: logical scalar

If True, do not wait for {drake} to discover a locked cache after the pipeline is run; unlock it immediately.


DRAKE_FORMAT: "rds"

Type: character scalar

A file format used to store intermediate results in DRAKE_CACHE_DIR. See https://books.ropensci.org/drake/plans.html#special-data-formats-for-targets for more details.

By default, R's RDS format is used (see ?saveRDS). We recommend using DRAKE_FORMAT: "qs" (see https://github.com/traversc/qs), which offers better performance but sometimes doesn't work correctly (drake throws untraceable errors).
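
As a rough illustration of the recommendation above, the following fragment of config/pipeline.yaml switches the storage format to "qs". It uses only parameters described in this document; the cache directory value is simply the default shown earlier.

```yaml
# Store intermediate results in the {qs} format instead of the default "rds".
# If drake starts throwing untraceable errors, switching back to "rds"
# is the first thing to try.
DRAKE_CACHE_DIR: ".drake"
DRAKE_FORMAT: "qs"
```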


DRAKE_REBUILD: null

Type: character scalar ("all" | "current") or null

Instruct {drake} to rebuild targets even though they are considered finished.


DRAKE_CACHING: "worker"

Type: character scalar ("worker" | "main")

How to collect data from parallel workers. See the caching parameter in drake::drake_config().


DRAKE_MEMORY_STRATEGY: "speed"

Type: character scalar ("speed" | "autoclean" | "preclean" | "lookahead" | "unload" | "none")

How to manage target objects in memory during runtime. See the memory_strategy parameter in drake::drake_config().

You can consider "autoclean", "preclean" or "lookahead" to conserve memory, but at the expense of speed.
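
A minimal sketch of a memory-conserving setup, assuming the run is limited by RAM rather than by time. The choice of "autoclean" here is only illustrative; "preclean" or "lookahead" are set the same way.

```yaml
# Trade some speed for a lower memory footprint.
# "autoclean" is one of the allowed values listed above; see the
# memory_strategy parameter in drake::drake_config() for what each does.
DRAKE_MEMORY_STRATEGY: "autoclean"
```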


DRAKE_LOG_BUILD_TIMES: False

Type: logical scalar

Whether to record build times for targets. Mac users may notice a 20% speedup in drake::make() with DRAKE_LOG_BUILD_TIMES: False.


BLAS_N_THREADS: null

Type: positive integer scalar or null

The maximum number of threads for BLAS operations, passed to RhpcBLASctl::blas_set_num_threads(). This prevents the "BLAS : Program is Terminated. Because you tried to allocate too many memory regions" error when massive target parallelism is used. Set to null to keep the BLAS defaults.
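
The fragment below sketches how BLAS threads might be capped when many targets run in parallel. The specific numbers (16 jobs, 1 BLAS thread) are assumptions chosen for illustration, not recommended values; suitable numbers depend on the machine.

```yaml
# Many parallel jobs, each restricted to a single BLAS thread, so that the
# total number of BLAS threads stays bounded.
# The values 16 and 1 are illustrative assumptions.
DRAKE_PARALLELISM: "future"
DRAKE_N_JOBS: 16
BLAS_N_THREADS: 1
```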


RSTUDIO_PANDOC: null

Type: character scalar or null

A path to the directory containing the pandoc binary, which is required for rendering HTML reports.

You can ignore this if:

In {rmarkdown}, the pandoc binary to use is then resolved by rmarkdown::find_pandoc().


SEED: 100

Type: integer scalar

The initial seed for the random number generator.


Parallelism

DRAKE_PARALLELISM: "loop"

Type: character scalar ("loop" | "future" | "clustermq")

The type of {drake} parallelism.



DRAKE_CLUSTERMQ_SCHEDULER: "multicore"

Type: character scalar

Which scheduler to use if DRAKE_PARALLELISM is "clustermq". See https://mschubert.github.io/clustermq/articles/userguide.html#configuration for possible values.
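
For orientation, a parallel setup using clustermq on a single machine could look like the sketch below. It combines only parameters documented on this page, with their default values where available; whether "multicore" suits a given system is an assumption to verify against the clustermq user guide linked above.

```yaml
# Run targets in parallel through clustermq on the local machine.
DRAKE_PARALLELISM: "clustermq"
DRAKE_CLUSTERMQ_SCHEDULER: "multicore"
# Number of parallel jobs (described below).
DRAKE_N_JOBS: 4
```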


DRAKE_N_JOBS: 4

Type: positive integer scalar

The number of parallel jobs for {drake}.


DRAKE_N_JOBS_PREPROCESS: 4

Type: positive integer scalar

A number of parallel jobs for processing the imports and doing other preprocessing tasks.


WITHIN_TARGET_PARALLELISM: False

Type: logical scalar

Allow or disable within-target parallelism through the BiocParallel package. Only possible when DRAKE_PARALLELISM is "loop".


N_CORES: 4

Type: positive integer scalar

A number of cores for within-target parallelism.
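
Because within-target parallelism is only possible with DRAKE_PARALLELISM set to "loop", the two settings go together. A minimal sketch, with the core count chosen only for illustration:

```yaml
# Targets are built one at a time ("loop"), but each target may use
# several cores internally through BiocParallel.
DRAKE_PARALLELISM: "loop"
WITHIN_TARGET_PARALLELISM: True
N_CORES: 4
```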


Targets

An informative plan is bound to every other plan and contains targets with useful runtime information. See the Targets section in vignette("config_main").



bioinfocz/scdrake documentation built on Sept. 19, 2024, 4:43 p.m.