We have seen how mutually recursive functions can implement many control flows.
Each of the cps_
functions as well as R_
are constructors, one for each node in the syntax tree, with one argument for each child node. These node constructors return a second set of constructors. This second set of constructors
This second set of constructors all take a continuation function as first argument, other
This might be illustrated better as ...
So, for instance, cps_for()
receives as arguments the constructors for each of the for
loop's arguments. Then at top level
loop <- `{_cps`( R_(j <- 0), for_cps(R_(i), R_(1:10), `{_cps`( R_(j <- j + i), yield_cps(R_(j)) )) )
This nest of mutually recursive functions can be directly called in a loop (pump.r
); this is a form of direct interpreter, and is how the async
worked in release 0.2.
But besides being executable, the nest of functions that has been constructed is data; each node in the graph is a closure object, with an expression an arguments that exist relative to an environment, and all of these objects being introspectable and accessible as data from R. These objects will form the input to the compilation process, and will serve as the "intermediate representation" through a few stages.
Having a directly executable intermediate form has testing benefits; I should be able to run my tests now on the directly interpreted generators, then after 1 pass of compilation the result should still be directly interpretable, and we can re-run the same tests again; and so in until code generation. This will let me build the compiler in smaller chunks, while still getting useful feedback and something "working" as I go along.
We have closures spanning a computational graph, first step is to catalogue the nodes and edges in the graph.
The function walk()
does this, starting with the entry function, visiting each node of the graph in turn. This is done in depth-first order, with walk
assigning a unique name to each node according to the path that first reached it. walk
has to detect when two paths lead to the same node.
walk
relies on all_names
which is a souped-up version of base all.names
; it additionally classifies the "role" in which each name appears, for example: as a variable, as a target of <-
, as target of <<-
, or as head of a function call, as head of a call in tail position. Note that all_names
cannot do a foolproof classification of what variables are used in any given R function without executing the function, so I have to accommodate that by writing each node in s style that that all_names
can "follow." This means the nodes have to be written in a sort of dialect of "simple R" that doesn't do much funny in terms of nonstandard or lazy evaluation, and that only makes calls to other node functions in tail position.
switch
. The most natural way to implement switch
is to hold all your continuations in an array and index into that array, but this is a kind of indirection that all_names
can't reasonably follow. The solution will be to cons up a compilable switch statement when the "switch" is being constructed.val
.r ret(cont, ...)
which unwinds the stack and jumps back in at r cont(...)
. Any loop-like construct should use ret
once per loop.all_names
will try to look up the functions you are calling in the environment of a given node function. If a function called in tail position has formals cont
and ...
args, it is considered a trampoline. walk
should collect information on both the trampoline-setting call and the trampolined call.on.exit
or return
or other stuff that reifies the duration of a function call.While R does not actually have tail call elimination, a notion of "tail call" is used in the async
compiler. all_names
implements a notion of a tail call; according to all_names
, a call is in tail position if it appears only in the last argument of the outermost set of {}
braces, on a branch of a switch
or if
, or in the second branch of a ||
or &&
; . If it is nested in any other kind of call it is not in tail position. A function that is deeply nested must be a tail argument of all enclosing functions to be considered a tail call. In this example:
function(val) { checkVal() if (val %% 2 == 1) { checkOdd(val) && plinkle(val*3-1) } else { checkEven(val) || tinkle(val/2) } }
In the above snippet, the calls plinkle(val)
and tinkle(val/2)
are in tail position, because they appear in the second args of ||
and &&
, in the final statement of {}
in, in a branch of if
, which is the final statement in {
. checkOdd
is not in tail position because it is in the first, rather than second argument of &&
; checkVal
is not in tail position because it is not in the last argument of its enclosing {}
. Meanwhile, /
is not in tail position because it appears in the args of tinkle
(even though tinkle
is in tail position.)
This notion will be useful because a tail call is a natural place to "splice" one function to the next. When evaluation reaches a tail position, the function is bound to return whatever the tail call returns. This means that upon making the call, anything that local variables in the function can be ignored from that point forward (again, this being in our "simple R" dialect where we avoid lazy eval shenanigans.) This means that at a later stage (this is also why a call in return
is not considered to be a tailcall, because splicing functions together would mean returns are called from different places.
When collecting the graph, tail calls are considered to be state transitions. When writing node functions, we only make calls to other node functions in tail position; walk
will only consider those to be state transitions.
Having collected the shape of the graph and its nodes, the first thing to do with this graph is to make a visual representation of it. This will be an instructive and diagnostic tool, as well as an aid to writing further compilation phases. If a reasonable drawing can be made of how a generator works as a graph, then I might have captured sufficient information to compile it. So walk
and compileGraph
were largely written in parallel; the former collecting information about the graph, and the latter generating a Graphviz DOT format file from this information.
compileGraph(walk(g), "filename.dot")
will spit out a Graphviz DOT file which can then be processed with the dot
command to draw a graph. (However this package will not add a dependency on Graphviz.)
As drawn in the graphs above, each instance of a construct like for
shares a context among its function nodes and hidden variables. This context is represented by the dotted lines. When directly executed, each node is a closure object, so that its variables. In this way two "for" loops can be nested without interfering with each other, because each creates its own nodes that have their own enclosing environment for its hidden state variables.
In the compiled form, all these nodes will be merged into the same namespace in a big switch
statement. Also, most optimization stages should go smoother if I don't have to think about names existing in different namespaces. So the first compiler pass is name-munging; placing all nodes into the same environment. The graph walker, walk
has already assigned a unique name to each node and context, as well as the local names and global targets of its tailcalls, what global names they correspond to; and all closed-over variables that are used.
In name munging I need to:
It's not clear whether to work on this step before or after code generation. Right now I am thinking that if optimization is done on a closxp-based representation of the graph, I can skip over it to get code generation sorted first. Or, since the optimized graph should still be directly executable, I should be able to do optimizations before code generation. But I have the most questions about how code generation works, so that seems like the likely next target.
The main optimizations are inlining of constants and tailcalls. Any state variables that are read from and not written should be inlined. You can do this with a "quoting environment" again; replace a call to cont(thing)
with {val <- thing; (...)}
with the contents of cont
.
At the very least, any node that has only one incoming connection should be inlined into its parent. At the most, any path transition that does not invoke a trampoline can be inlined (possibly duplicating some paths, but making fewer states in the bargain), This will effectively remove some states from the graph, so following this step, re-do the transitive closure (or even just re-walk based on your new START state) you might check the transitive closure starting from START and remove stuff that's no longer in the closure.
Now, in theory, you can go over the graph and take each function,
cont(val)
with assignments to a variables val
and state
quo_ <- function (expr, env) as.call(list(function() eval(expr, env)))
which is one callsxp containing a CLOSXP which is closed over an ENVSXP with two vars. Should be efficient enough, The point of this will be that you can inject one of these _quo s into an expression, or eval() it and it should come back with the right value, giving a level of "hygeine" to evaluations. And unlike using a raw promsxp, a closure will generally print okay.
Now, a question I have is about how to handle the R_
functions because they will be working in the same namespace as the rest of the code. If they take the form of "eval" I might be able to just unwrap them. The question is whether I can do this early or late. I might want to generate R_
functions that are bound to the target environment? Or just directly label the R_ closxps with an attribute that the compiler can pick up?
-->
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.