This function runs the workflow provided in the arguments, skipping any component of the workflow (node) whose parent nodes have not changed.
... | Formulas, separated by commas, that define the workflow of the project. Each formula is written as outdir(out) ~ node(indir1(in1) + indir2(in2) + ..., options); the components of the formula are described below.
verbose |
An individual formula is made up of the following components:
node
The name of the .R file as it exists in the code folder defined in aqueduct_setup(). Do not include the .R file extension.
indir1, indir2
The names of the directories for the first and second input files, where the names are defined in aqueduct_setup().
in1, in2
The names of the first and second input files, without extensions, located in indir1 and indir2, respectively. Most of the time the files will be in .csv format, but aqueduct() will also read .xlsx and .dta files without the extension being specified. If the input files are not found in the input directories, aqueduct() looks for them in the subdirectories of the input directories.
outdir
The name of the directory for the output file.
out
The name of the output file created by node. aqueduct() determines the type of the output and saves it in the appropriate file format. In most cases the output is a single .csv file. When the main function in node returns a list of dataframe-style objects, one .csv file per object is saved in outdir with the file name [name_in_list].csv, where name_in_list is the name given to that object in the list. In this case, specifying outdir() alone is sufficient. If the output is not a dataframe, matrix, or vector, it is saved as an .RData file.
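The lookup rule for in1, in2 and the saving rule for out can be pictured with a short base R sketch. This is not aqueduct's code: find_input() and save_output() are hypothetical helper names, and the exact precedence between extensions and between a directory and its subdirectories is an assumption made for illustration.

```r
# Hypothetical helpers illustrating the rules described above;
# they are NOT part of the aqueduct API.

# Look for <name>.csv, <name>.xlsx or <name>.dta in `dir`,
# then fall back to the subdirectories of `dir`.
find_input <- function(dir, name, exts = c("csv", "xlsx", "dta")) {
  candidates <- paste0(name, ".", exts)
  top <- file.path(dir, candidates)
  top <- top[file.exists(top)]
  if (length(top) > 0) return(top[1])
  all_files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  hits <- all_files[basename(all_files) %in% candidates]
  if (length(hits) == 0) stop("No input file found for '", name, "' in ", dir)
  hits[1]
}

# Save an object according to its type and return the path(s) written.
save_output <- function(obj, outdir, out = NULL) {
  is_tabular <- function(x) is.data.frame(x) || is.matrix(x)
  if (is.data.frame(obj) || is.atomic(obj)) {
    # Dataframe, matrix or vector: a single .csv file
    path <- file.path(outdir, paste0(out, ".csv"))
    write.csv(obj, path, row.names = FALSE)
  } else if (is.list(obj) && length(obj) > 0 && !is.null(names(obj)) &&
             all(vapply(obj, is_tabular, logical(1)))) {
    # Named list of dataframe-style objects: one [name_in_list].csv each
    path <- file.path(outdir, paste0(names(obj), ".csv"))
    for (i in seq_along(obj)) write.csv(obj[[i]], path[i], row.names = FALSE)
  } else {
    # Anything else: an .RData file
    path <- file.path(outdir, paste0(out, ".RData"))
    save(obj, file = path)
  }
  invisible(path)
}
```

Under these assumptions, a fitted model object would be written as an .RData file, while a named list of dataframes would produce one .csv file per element without `out` being supplied.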
The options section of the formula contains additional arguments to pass on to the node file.
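To make the formula layout concrete, the sketch below splits a formula of the form outdir(out) ~ node(indir1(in1) + indir2(in2) + ..., options) into its components using base R only. parse_workflow_formula() is a hypothetical name and aqueduct()'s own parser may work differently. The sketch assumes options are passed as named arguments and does not handle an explicitly empty input slot such as create_data(, seed = 1996).

```r
# Hypothetical helper; NOT part of the aqueduct API.
parse_workflow_formula <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 3)
  lhs <- f[[2]]  # outdir(out)
  rhs <- f[[3]]  # node(indir1(in1) + indir2(in2) + ..., options)

  # Turn a call such as raw(main_db) into list(dir = "raw", file = "main_db")
  as_ref <- function(call) {
    list(
      dir  = as.character(call[[1]]),
      file = if (length(call) > 1) as.character(call[[2]]) else NA_character_
    )
  }

  # Flatten a chain of indir(in) terms joined by `+`
  flatten_inputs <- function(expr) {
    if (is.call(expr) && identical(expr[[1]], as.name("+"))) {
      c(flatten_inputs(expr[[2]]), flatten_inputs(expr[[3]]))
    } else {
      list(as_ref(expr))
    }
  }

  args <- as.list(rhs)[-1]
  arg_names <- names(args)
  if (is.null(arg_names)) arg_names <- rep("", length(args))
  unnamed <- which(!nzchar(arg_names))

  list(
    node    = as.character(rhs[[1]]),
    output  = as_ref(lhs),
    inputs  = if (length(unnamed) > 0) flatten_inputs(args[[unnamed[1]]]) else list(),
    options = args[nzchar(arg_names)]  # assumed: options are the named arguments
  )
}

# The seed option here is added purely for illustration.
spec <- parse_workflow_formula(
  derived(db_w_chars) ~ add_chars(raw(main_db) + derived(clean_chars), seed = 1996)
)
str(spec)
```

Because a formula is stored unevaluated, directory and file names such as raw and main_db never need to exist as R objects; they are read as symbols from the parsed expression.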
workflow | A dataframe of all workflow nodes, with each node's timestamp before the run and its timestamp after the run.
plot | A plot displaying a directed acyclic graph (DAG) of the workflow.
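The help text does not say how aqueduct() decides that a node's parents "have not changed". Since the returned workflow dataframe records timestamps, one plausible reading is a comparison of file modification times. The sketch below shows that idea only; node_is_stale() is a hypothetical name, and the real function may use a different rule.

```r
# Hypothetical helper; NOT part of the aqueduct API.
# A node is treated as stale when its output is missing or when any
# input file was modified more recently than the output file.
node_is_stale <- function(input_paths, output_path) {
  if (!file.exists(output_path)) return(TRUE)
  # Assumption: a node without inputs is not re-run once its output exists
  if (length(input_paths) == 0) return(FALSE)
  isTRUE(any(file.mtime(input_paths) > file.mtime(output_path), na.rm = TRUE))
}
```

A scheduler built on this check would visit nodes in DAG order and call the node only when node_is_stale() returns TRUE, so that a fresh output upstream makes every downstream node stale in turn.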
# Let's say we are working for a government agency that provides affordable
# housing to low-income individuals. We want to determine if there are
# subgroups of the population that disproportionately drop out of the
# program despite being eligible. The workflow is:
# 1. Clean and merge characteristic files of individuals eligible
#    for the program using a file called load_chars.R
# 2. Merge characteristic files onto the main database file that contains
#    information on whether or not eligible individuals participated in
#    the program, and if so, how long they participated.
# 3. Run an SVM to classify groups that are disproportionately likely to
#    drop out of the program
# 4. Produce plots based on these results and the underlying
#    characteristics of the population
# 5. Produce a knitted document displaying these results
# The directory layout is as follows, starting from the basepath:
# /finding_groups
# ----/code
# --------/clean
# ------------/load_chars.R
# ------------/clean_chars.R
# --------/build
# ------------/add_chars.R
# --------/analyze
# ------------/svm_classify.R
# ------------/create_plots.R
# ------------/produce_report.Rmd
# ----/data
# --------/raw
# ------------/main_db.csv
# ------------/chars
# ----------------/location_file.csv
# ----------------/race_file.csv
# ----------------/age_file.csv
# ----------------/education_file.csv
# --------/derived
# --------/current
# ----/output
# --------/plots
# First, set paths using aqueduct_setup()
aqueduct_setup(
  basepath = "C:/Users/Harvey/GitHub/aqueduct/examples/example1",
  raw ~ basepath/data/raw,
  derived ~ basepath/data/derived,
  current ~ basepath/data/current,
  plots ~ basepath/output/plots
)
# Then run the aqueduct workflow!
aqueduct(
  raw() ~ create_data(, seed = 1996),
  derived(chars) ~ load_chars(raw(location_file) +
                                raw(race_file) +
                                raw(age_file) +
                                raw(education_file)),
  derived(clean_chars) ~ clean_chars(derived(chars)),
  derived(db_w_chars) ~ add_chars(raw(main_db) + derived(clean_chars)),
  current(classified_groups) ~ svm_classify(derived(db_w_chars)),
  plots(classify_plots) ~ create_plots(current(classified_groups) +
                                         derived(clean_chars)),
  output(final_report) ~ produce_report()
)