knitr::opts_chunk$set(
  collapse = TRUE,
  fig.path = "man/figures/vignette-",
  comment = "#>",
  dpi = 120
)
options(drake_make_menu = FALSE, drake_clean_menu = FALSE)
The goal of drakepkg
is to demonstrate how a drake
workflow can be organized as an R package. Users who are not yet familiar with drake
should review the package's User Manual before continuing with this vignette.
The following examples illustrate how drake
workflows can be reproduced when they are included in an R package.
This example borrows the main
example from the drake
package documentation and recreates it within an R package.
The plan is included in drakepkg
as a function:
library(drakepkg) # devtools::install_github("tiernanmartin/drakepkg")
get_example_plan_simple()
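To make the idea of "a plan as a function" concrete, here is a minimal sketch. The function name get_plan_sketch() and its three targets are hypothetical and are not drakepkg's actual plan; the only drake call used is drake::drake_plan(), which records the target commands without running them.

```r
# Hypothetical sketch: wrap a drake plan in a function so a package can export it.
# The targets below are illustrative only; they are not drakepkg's real plan.
get_plan_sketch <- function() {
  drake::drake_plan(
    raw   = datasets::iris,                                      # input data
    fit   = lm(Sepal.Width ~ Petal.Width + Species, data = raw), # model target
    coefs = coef(fit)                                            # depends on `fit`
  )
}
```

Under these assumptions, drake::make(get_plan_sketch()) would build the three targets in dependency order, just as make(get_example_plan_simple()) does for the packaged plan.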
Here are the steps needed to reproduce this plan:
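1. Work in a dedicated project directory (optional)
2. Copy the package's files into the working directory: copy_drakepkg_files()
3. Make the plan: make(get_example_plan_simple())
4. Access the plan's targets with drake functions like readd() or loadd()
5. View the report (see the documents/ directory)

The first step is optional but strongly recommended; it is generally accepted as a best practice that data analysis projects should be self-contained.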
The second step is an important one. Most drake plans interact with the user's file system at some point, typically to read inputs or write outputs. drakepkg's inst/ directory contains the files and directories that are needed to successfully make get_example_plan_simple(). The copy_drakepkg_files() function copies the following directories from drakepkg into the user's working directory:
copy_drakepkg_files()
.
├── documents
├── extdata
└── intdata
    ├── R
    │   └── make-iris-internal.R
    └── iris-internal.xlsx
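The copying step can be approximated in base R. The sketch below is not the implementation of copy_drakepkg_files(); the helper name copy_pkg_dirs() and its arguments are hypothetical. It assumes the directories sit at the root of an installed package, which is where the contents of inst/ end up after installation.

```r
# Hypothetical helper: copy selected directories from a source root
# (for example, the root of an installed package) into a destination directory.
copy_pkg_dirs <- function(src_root, dest = getwd(),
                          dirs = c("documents", "extdata", "intdata")) {
  src <- file.path(src_root, dirs)
  src <- src[dir.exists(src)]  # silently skip directories that are absent
  # recursive = TRUE copies each directory and its contents into `dest`;
  # overwrite = FALSE leaves files that already exist untouched.
  ok <- file.copy(src, dest, recursive = TRUE, overwrite = FALSE)
  invisible(stats::setNames(ok, basename(src)))
}
```

With drakepkg installed, a call such as copy_pkg_dirs(system.file(package = "drakepkg")) would be expected to reproduce the tree shown above in the current working directory.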
The third step is to make the plan:
clean(destroy = TRUE)
make(get_example_plan_simple())
The workflow's dependency graph can be displayed using drake::vis_drake_graph():
get_example_plan_simple() %>% drake_config() %>% vis_drake_graph()
The final output of the plan above is the report target, but any of the targets can be accessed using drake functions like loadd() or readd().
# retrieve a target from the drake cache and inspect it
loadd(fit)
summary(fit)
# inspect a target without storing it in the local environment
readd(hist)
The second example builds on the first by introducing external data. The drake
cache automatically stores a copy of each target in a plan, but when the plan accesses data from
an external source, it's a good idea to store a local copy of that data in addition to the cached copy.
The following plan downloads the iris
dataset from a GitHub repository and stores it in the extdata
directory of the user's working directory, like so:
.
├── documents
├── extdata
│   └── iris-external.xlsx  <-- file downloaded in the plan is stored here
└── intdata
    ├── R
    │   └── make-iris-internal.R
    └── iris-internal.xlsx
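One way to keep a local copy of external data is to download the file only when it is not already on disk. The helper below is a hypothetical sketch in base R, not the code used by get_example_plan_external(); the name fetch_local_copy() and its arguments are assumptions made for illustration.

```r
# Hypothetical helper: download `url` to `path` unless a local copy already exists.
fetch_local_copy <- function(url, path) {
  if (!file.exists(path)) {
    # create the target directory (e.g. extdata/) if it is missing
    dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps binary files such as .xlsx intact on every platform
    utils::download.file(url, destfile = path, mode = "wb", quiet = TRUE)
  }
  path  # return the local path so later steps can read from it
}
```

Inside a drake plan, the destination path would typically be wrapped in file_out() so that drake tracks the downloaded file as an output of the target.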
Here is the plan:
clean(destroy = TRUE)
make(get_example_plan_external())
get_example_plan_external() %>% drake_config() %>% vis_drake_graph()
The visualization below shows that the new "iris" data is actually just random numbers:
readd(hist)
(work in progress)