knitr::opts_chunk$set(
  collapse = TRUE,
  fig.path = "man/figures/vignette-",
  comment = "#>",
  dpi = 120
)
options(drake_make_menu = FALSE, drake_clean_menu = FALSE)
The goal of drakepkg is to demonstrate how a drake workflow can be organized as an R package. Users who are not yet familiar with drake should review the drake User Manual before continuing with this vignette.
The following examples illustrate how drake workflows can be reproduced when they are included in an R package.
This example borrows the main example from the drake package documentation and recreates it within an R package.
The plan is included in drakepkg as a function:
library(drakepkg) # devtools::install_github("tiernanmartin/drakepkg")
get_example_plan_simple()
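To make the idea of "a plan as a function" concrete, here is a minimal sketch. The function name get_toy_plan() and its targets are hypothetical and are not drakepkg's actual code; only drake::drake_plan() comes from drake itself.

```r
library(drake)

# Hypothetical example: a package function that returns a drake plan.
# The target names (raw_data, fit) are illustrative, not drakepkg's.
get_toy_plan <- function() {
  drake_plan(
    raw_data = datasets::iris,
    fit = lm(Sepal.Width ~ Petal.Width, data = raw_data)
  )
}

plan <- get_toy_plan()
print(plan)  # a data frame with one row per target
```

Wrapping the plan in a function means it is installed, documented, and versioned along with the rest of the package, and users can pass the result straight to make().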
Here are the steps needed to reproduce this plan:
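1. Set up a self-contained project directory (optional)
2. Copy the package's files into the working directory: copy_drakepkg_files()
3. Make the plan: make(get_example_plan_simple())
4. View the results: access the targets with drake functions like readd() or loadd(), or open the files written to disk (documents/ directory)

The first step is optional but strongly recommended; it is generally accepted as a best practice that data analysis projects should be self-contained.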
The second step is an important one. Most drake plans interact with the user's file system at some point, typically to read inputs or write outputs. drakepkg's inst/ directory contains the files and directories that are needed to successfully make get_example_plan_simple(). The copy_drakepkg_files() function copies the following directories from drakepkg into the user's working directory:
copy_drakepkg_files()
.
├── documents
├── extdata
└── intdata
    ├── R
    │   └── make-iris-internal.R
    └── iris-internal.xlsx
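The copy step described above can be sketched in base R. This is not the source of copy_drakepkg_files(); the helper copy_dirs() and the temporary directories are hypothetical stand-ins. In a real package the source directory would typically be the installed package root returned by system.file(package = "<pkg>"), which is where the contents of inst/ end up.

```r
# Hypothetical helper: copy selected sub-directories of `from` into `to`.
copy_dirs <- function(from, to = getwd(),
                      dirs = c("documents", "extdata", "intdata")) {
  src <- file.path(from, dirs)
  src <- src[dir.exists(src)]  # skip directories that are absent
  dir.create(to, recursive = TRUE, showWarnings = FALSE)
  # overwrite = FALSE leaves any files the user already has untouched
  ok <- file.copy(src, to, recursive = TRUE, overwrite = FALSE)
  invisible(stats::setNames(ok, basename(src)))
}

# Demonstration: temporary directories stand in for the installed
# package and for the user's project directory.
pkg_root <- file.path(tempdir(), "fake-pkg")
project  <- file.path(tempdir(), "my-project")
dir.create(file.path(pkg_root, "intdata", "R"),
           recursive = TRUE, showWarnings = FALSE)
file.create(file.path(pkg_root, "intdata", "R", "make-iris-internal.R"))
dir.create(file.path(pkg_root, "documents"), showWarnings = FALSE)

copied <- copy_dirs(pkg_root, project)
list.files(project, recursive = TRUE)
```

Copying with overwrite = FALSE is a deliberate choice in this sketch: re-running the copy should not clobber files the user has edited since the first run.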
The third step is to make the plan:
clean(destroy = TRUE)
make(get_example_plan_simple())
The workflow's dependency graph can be displayed using drake::vis_drake_graph():
get_example_plan_simple() %>% drake_config() %>% vis_drake_graph()
The final output of the plan above is the report target, but any of the targets can be accessed using drake functions like loadd() or readd().
# retrieve a target from the drake cache and inspect it
loadd(fit)
summary(fit)

# inspect a target without storing it in the local environment
readd(hist)
The second example builds on the first by introducing external data. The drake cache automatically stores a copy of each target in a plan, but when the plan accesses data from an external source it's a good idea to store a local copy of that data in addition to the cached copy.
The following plan downloads the iris dataset from a GitHub repository and stores it in the extdata directory in the user's working directory, like so:
.
├── documents
├── extdata
│   └── iris-external.xlsx <-- file downloaded in the plan is stored here
└── intdata
    ├── R
    │   └── make-iris-internal.R
    └── iris-internal.xlsx
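One way to keep a local copy of external data is to download the file only when it is not already on disk. The sketch below is an assumption about how such a step might look, not the code used in get_example_plan_external(); the helper fetch_once() is hypothetical, and the URL in the commented-out call is a placeholder rather than the repository the plan actually uses.

```r
# Hypothetical helper: download `url` to `path` unless a local copy exists.
fetch_once <- function(url, path) {
  if (!file.exists(path)) {
    dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps binary files such as .xlsx intact on Windows
    utils::download.file(url, destfile = path, mode = "wb")
  }
  invisible(path)
}

# Placeholder URL, shown commented out so the sketch runs offline:
# fetch_once("https://example.com/iris-external.xlsx",
#            file.path("extdata", "iris-external.xlsx"))

# Offline demonstration: an existing file is left untouched and
# no download is attempted.
local_copy <- file.path(tempdir(), "extdata", "iris-external.xlsx")
dir.create(dirname(local_copy), recursive = TRUE, showWarnings = FALSE)
writeLines("already here", local_copy)
result <- fetch_once("https://example.com/iris-external.xlsx", local_copy)
```

Inside a drake plan, a command that writes a file like this would normally wrap the path in drake's file_out() so that drake tracks the file as an output.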
Here is the plan:
clean(destroy = TRUE)
make(get_example_plan_external())
get_example_plan_external() %>% drake_config() %>% vis_drake_graph()
The visualization below shows that the new "iris" data is actually just random
numbers:
readd(hist)
(work in progress)