
This document still needs substantial editing. It is kept here because it may already be useful in its current state.

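Here is a pipeline that uses the %>.% operator from {svFlow}, where the dot (.) explicitly marks where the piped data goes in each call: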

library(dplyr)
library(svFlow)
threshold <- 1.5
iris %>.%
  filter(., Petal.Length > threshold) %>.%
  mutate(., log_var = log(Petal.Length)) %>.%
  head(.) %>.% .

Use flow() to add local variables inside the pipeline and to get convenient, transparent resolution of the tidyeval mechanism:

flow(iris, var_ = Petal.Length, threshold = 1.5) %>_%
  filter(., var_ > threshold_) %>_%
  {..$tab <- mutate(., log_var = log(var_))} %>_%
  head(.) %>_% .

Convert this into a reusable function by replacing flow() with function() and starting the pipeline with enflow():

my_process <- function(data, var_ = Petal.Length, threshold = 1.5)
  enflow(data) %>_%
  filter(., var_ > threshold_) %>_%
  {..$tab <- mutate(., log_var = log(var_))} %>_%
  head(.) %>_% .

You then use it just like a plain function. The trailing underscore in an argument name also makes it easy to spot the arguments that are treated specially by the tidyeval mechanism. Here, we rerun the same analysis:

my_process(iris)

Here, we change the variable and the threshold:

my_process(iris, var_ = Sepal.Width, threshold = 0.5)
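For comparison, a roughly equivalent function can be sketched with plain {dplyr} and the embrace operator {{ }} of tidyeval, without {svFlow}. The name my_process_tidy is hypothetical and not part of any package. This sketch also does not keep the intermediate tab object that the flow version stores in the Flow object.

```r
library(dplyr)

# Hypothetical plain-dplyr counterpart of my_process(), for comparison only
my_process_tidy <- function(data, var = Petal.Length, threshold = 1.5) {
  data %>%
    # {{ var }} forwards the unevaluated column name to the dplyr verb
    filter({{ var }} > threshold) %>%
    mutate(log_var = log({{ var }})) %>%
    head()
}

# Same two calls as above: defaults first, then another variable and threshold
my_process_tidy(iris)
my_process_tidy(iris, var = Sepal.Width, threshold = 0.5)
```

In this version, nothing in the argument names tells the reader which ones are quoted, which is the point made above about the trailing underscore convention.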

Flow objects to test alternate scenarios

Flow objects can be subclassed. This is somewhat similar to branches in a git repository, although with automatic updates from the master branch. It would be nice to keep this comparison as close as possible and to make both approaches conceptually similar, so that one can work in much the same way with flow() and with git. We need tools to create, delete, switch to, merge (into master only?), rebase, and diff branches.

The biggest difference is that git branches do not dynamically inherit objects from their parents, whereas proto/Flow objects do (here, the main branch is indeed the ancestor). So, we could create a branch() function that does something like this:

# Create a new branch called lda
#..$branch("lda", model = mlLda(Species ~ ., data = .))
# Switching to a branch or referring to a branch: use one of those two syntaxes
#..@lda(model = mlLda(Species ~ ., data = .))
#..(lda)(model = mlLda(Species ~ ., data = .))
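The dynamic inheritance described above can be illustrated with plain R environments. This is only a sketch: it assumes that Flow objects resolve names the way proto objects do, that is, as environments that look up missing names in their parent. The names main and lda are placeholders for illustration, and no {svFlow} function is used.

```r
# Plain environments stand in for Flow objects here (assumption for illustration)
main <- new.env()
main$threshold <- 1.5

# 'lda' plays the role of a branch: a child environment whose parent is 'main'
lda <- new.env(parent = main)

# 'threshold' is not defined in 'lda', so it is looked up in 'main'
inherited <- get("threshold", envir = lda)

# A later change in 'main' is seen from 'lda' without any merge or rebase
main$threshold <- 2
updated <- get("threshold", envir = lda)

# Overriding the value in 'lda' masks the parent's value and leaves 'main' untouched
lda$threshold <- 0.5
overridden <- get("threshold", envir = lda)
```

A git branch, by contrast, would keep the value it had at branching time until an explicit merge or rebase.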

Visual map of Flow objects inheritance

TODO: an older version of the proto package had a nice graph.proto() function, and I wrote my own version with other dependencies. Reimplement it for {svFlow} to show the workflow in a way similar to how a git repository is often depicted.



SciViews/flow documentation built on May 4, 2024, 6:22 a.m.