aqueduct: Run an aqueduct Workflow

View source: R/aqueduct.R

Description

This function runs the workflow provided in the arguments, skipping any component of the workflow (node) whose parent nodes have not changed.
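
The skipping behaviour can be pictured as a make-style timestamp comparison. The sketch below is an illustration only: needs_rerun() is a hypothetical helper, and this page does not state how aqueduct itself detects changes. It assumes a node is stale when its output file is missing or older than any of its input files.

```r
# Hypothetical helper, not part of aqueduct: TRUE when `output` is missing
# or older than at least one of `inputs`.
needs_rerun <- function(output, inputs) {
  if (!file.exists(output)) return(TRUE)
  any(file.mtime(inputs) > file.mtime(output))
}

# Demonstration with temporary files
input  <- tempfile(fileext = ".csv")
output <- tempfile(fileext = ".csv")
writeLines("a,b", input)
writeLines("a,b", output)

# Force known modification times so the comparison is deterministic
Sys.setFileTime(input,  as.POSIXct("2021-01-01 00:00:00", tz = "UTC"))
Sys.setFileTime(output, as.POSIXct("2021-01-02 00:00:00", tz = "UTC"))
fresh <- needs_rerun(output, input)  # output is newer than its input

# Pretend the input was edited after the output was written
Sys.setFileTime(input, as.POSIXct("2021-01-03 00:00:00", tz = "UTC"))
stale <- needs_rerun(output, input)  # input is now newer than the output
```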

Usage

aqueduct(..., verbose = FALSE)

Arguments

...

Formulas, separated by commas, that dictate the workflow of the project. Each formula is written as

outdir(out) ~ node(indir1(in1) + indir2(in2) + ..., options)

where the components of the formula are described below.
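
To make the shape of a formula concrete, the following base R sketch takes one formula from the Examples apart into its output directory, output name, node, and inputs. It is only an illustration of the syntax: split_inputs() is a hypothetical helper, and nothing here is claimed to match how aqueduct parses formulas internally.

```r
# `~` builds a formula without evaluating it, so none of these names
# need to exist as functions or objects.
f <- derived(db_w_chars) ~ add_chars(raw(main_db) + derived(clean_chars))

lhs <- f[[2]]  # left-hand side:  derived(db_w_chars)
rhs <- f[[3]]  # right-hand side: add_chars(...)

outdir <- as.character(lhs[[1]])  # directory alias of the output
out    <- as.character(lhs[[2]])  # name of the output
node   <- as.character(rhs[[1]])  # node (script) that produces it

# Hypothetical helper: walk the chain of `+` calls and collect each
# indir(in) term as its own expression.
split_inputs <- function(expr) {
  if (is.call(expr) && identical(expr[[1]], as.name("+"))) {
    c(split_inputs(expr[[2]]), split_inputs(expr[[3]]))
  } else {
    list(expr)
  }
}

# First argument of the node call holds the inputs
inputs <- vapply(split_inputs(rhs[[2]]), deparse, character(1))
```

Here outdir, out, and node hold the three names on either side of the tilde, and inputs holds one string per indir(in) term.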

verbose

Logical; FALSE by default. If TRUE, all output of the code is shown while it runs.

Details

Formula Arguments

An individual formula is made up of the following components

Formula Options

The options section of the formula contains additional arguments to pass on to the node file.
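
As an illustration, the Examples use seed = 1996 as an option to a node that takes no inputs. The sketch below pulls such named arguments out of a formula with base R. It assumes options are the named arguments of the node call, which is inferred from the usage pattern above; how aqueduct forwards them to the node file is not described on this page.

```r
# Formula taken from the Examples: no inputs, one option
f <- raw() ~ create_data(, seed = 1996)

node_call <- f[[3]]                # create_data(, seed = 1996)
args      <- as.list(node_call)[-1] # drop the function name, keep arguments

# Named arguments are treated as options (an assumption of this sketch)
arg_names <- names(args)
if (is.null(arg_names)) arg_names <- rep("", length(args))
opts <- args[nzchar(arg_names)]

opts$seed  # the value supplied for the `seed` option
```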

Value

workflow

A data frame of all the workflow nodes, with each node's timestamp before running and its timestamp after running.

plot

A plot displaying a directed acyclic graph (DAG) of the workflow.
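
The DAG can be thought of as a set of edges running from each input to the output it feeds. The sketch below derives such an edge list from three of the formulas in the Examples, using only base R. It is a simplified picture that labels nodes by file name alone and ignores the directory aliases; it is not the code aqueduct uses to build its plot.

```r
# Three formulas from the Examples; `~` does not evaluate them
formulas <- list(
  derived(clean_chars) ~ clean_chars(derived(chars)),
  derived(db_w_chars) ~ add_chars(raw(main_db) + derived(clean_chars)),
  current(classified_groups) ~ svm_classify(derived(db_w_chars))
)

# all.vars() returns the variable names in an expression and skips
# function names, so it yields the file names on each side.
edges <- do.call(rbind, lapply(formulas, function(f) {
  data.frame(
    from = all.vars(f[[3]]),  # inputs on the right-hand side
    to   = all.vars(f[[2]]),  # output on the left-hand side
    stringsAsFactors = FALSE
  )
}))

edges
```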

Examples

# Let's say we are working for a government agency that provides affordable
# housing to low-income individuals. We want to determine if there are
# subgroups of the population that disproportionately drop out of the
# program despite being eligible. The workflow is: 
#    1. Clean and merge characteristic files of individuals eligible 
#       for program using a file called load_chars.R
#    2. Merge characteristic files onto the main database file that contains
#       information on whether or not eligible individuals participated in
#       the program, and if so, how long they participated.
#    3. Run an SVM to classify groups that are disproportionately likely to
#       drop out of the program
#    4. Produce plots based on these results and the underlying
#       characteristics of the population
#    5. Produce a knitted document displaying these results
# The directory layout is as follows, starting from the basepath:
#    /finding_groups
#    ----/code
#    --------/clean
#    ------------/load_chars.R
#    ------------/clean_chars.R
#    --------/build
#    ------------/add_chars.R
#    --------/analyze
#    ------------/svm_classify.R
#    ------------/create_plots.R
#    ------------/produce_report.Rmd
#    ----/data
#    --------/raw
#    ------------/main_db.csv
#    ------------/chars
#    ----------------/location_file.csv
#    ----------------/race_file.csv
#    ----------------/age_file.csv
#    ----------------/education_file.csv
#    --------/derived
#    --------/current
#    ----/output
#    --------/plots

# First, set paths using aqueduct_setup()
aqueduct_setup(
  basepath = "C:/Users/Harvey/GitHub/aqueduct/examples/example1",
  raw      ~ basepath/data/raw,
  derived  ~ basepath/data/derived,
  current  ~ basepath/data/current,
  plots    ~ basepath/output/plots
)
# Then run the aqueduct workflow!
aqueduct(
  raw() ~ create_data(, seed = 1996),
  derived(chars) ~ load_chars(raw(location_file) +
                              raw(race_file) +
                              raw(age_file) +
                              raw(education_file)),
  derived(clean_chars) ~ clean_chars(derived(chars)),
  derived(db_w_chars) ~ add_chars(raw(main_db) + derived(clean_chars)),
  current(classified_groups) ~ svm_classify(derived(db_w_chars)),
  plots(classify_plots) ~ create_plots(current(classified_groups) +
                                       derived(clean_chars)),
  output(final_report)  ~ produce_report() 
)

harveybarnhard/aqueduct documentation built on Jan. 1, 2021, 3:15 a.m.