Generator's print method is not finding the "state" variable that nextOr uses, because it was munged to 'nextOr|state'.
Is nextOr
part of the munging? really? I guess I think/thought it and pump
would have to be namespace-munged.
Okay, I think the answer is to add a "getState" function to the start set. The start set is really "nodes that we have to carry over (whether they're in the graph of not) and have consistent names."
This would mean that runPump was not translated? But my thingy says it's translated. It should be moving over with the nodes. Am I tracing that?
But wait, the nodes are moved last! They should move first, perhaps. But wait, then, what thing is my translation finding? It shouldn't find a function there at all?
Mysteries. I changed the order and...
It is still doing it.
> gc <- compile(g, level=-1) > graph <- walk(g) > graphc <- walk(gc) > length(graphc$contexts) [1] 2 > names(graphc$contexts) [1] "runPump|" "entry|"
But the log said:
Context: stop_| Moving nodes: Node: `stop_|`$`stop_` -> `stop_` Node: `stop_|`$`runPump` -> `runPump` Static pointer?: `runPump` -> `stop_|runPump` with translated reference: `stop_|`$`runPump` -> `runPump`
Weird... yeah, just runPump is failing to be ported?
I have both the pre- and post-compiled graphs;
> (graph$contexts[["stop_|"]][["runPump"]]) function(...) { ... } <environment: 0x55ed7e7ea790> > graphc$contexts[["runPump|"]]$pump function(...) { ... } <environment: 0x55ed7e7ea790>
Hold on. There's also a "runPump" in the normal context. They are different:
graphc$contexts[['entry|']][["runPump"]] function (...) { assert(`stop_|action` == "pause", paste0("pump asked to continue, but last action was ", `stop_|action`)) `stop_|pumpCont`(...) ... } <environment: 0x55ed760dfa48> > graphc$contexts[["runPump|"]]$runPump function(...) { assert(action == "pause", paste0("pump asked to continue, but last action was ", action)) pumpCont(...) # here's where you inject a return value into a yield? ... } <environment: 0x55ed7e7ea790>
To the translated one went into the right spot while the other one is being picked up as a node? OH! It's getting them from getStartSet
somehow?
lapply(getStartSet.generator(gc), environment) $entry <environment: 0x55ed760dfa48> ... $runPump <environment: 0x55ed7e7ea790>
Okay yeah it's coming from getStartSet
. How? A bad variable reference. Was using global g
instead of argument x
Okay, so after munging I have both "runPump" and "stop|runPump" defined. It looks like it is moving the node as well as data. It shouldn't do that. It's doing that because runPump is "read" as well as called.
Who does the reading? Is that in the graph?
> lapply(as.list(graph$contextNodes[[graph$nodeContexts[["runPump"]]]]), function(nodeName) graph$nodeProperties[[nodeName]]$read) ... $pump [1] "action" "nonce" "runPump" "value" ...
Okay, it's registering as a var because of the call to doWindup(runPump)
.
Options for fixing this:
doWindup
's args during walk
, walk
treat doWindup
as a trampoline call even if it's not the tail. I did 4. just needed to make it detect trampoline calls even outside the function tail.
Well we need trace
over there in the destination environment too. Why isn't trace gathered?
I need to also carry over "utility" functions, then. This works the same as moving nodes, I guess.
I made it copy over utility functions. Problem: this creates a bunch of actual duplicates of "trace,", bu thtis doesn't stop me right noe.
Bigger Problem: it is copying over "stop" and "pumpCont" as if they are utilities.
Context: stop_| Moving nodes: Node: `stop_|`$`stop_` -> `stop_` Node: `stop_|`$`runPump` -> `runPump` Node: `stop_|`$`return_` -> `return_` Node: `stop_|`$`ret_` -> `_3.do_.body.2.ret` Node: `stop_|`$`pump` -> `pump` Node: `stop_|`$`pause_` -> `_3.do_.body.1.yield.pause` Moving utils: Function: `stop_|`$`stop` -> `stop_|stop` Function: `stop_|`$`trace` -> `stop_|trace` Function: `stop_|`$`pumpCont` -> `stop_|pumpCont` Moving data: State var: `action` -> `stop_|action` State pointer: `pumpCont` -> `stop_|pumpCont` with translated reference: `entry|`$`R_` -> `entry` State var: `verbose` -> `stop_|verbose` State var: `nonce` -> `stop_|nonce` State var: `value` -> `stop_|value`
Checking with the graph:
graph$contextProperties[["stop_|"]] lapply(as.list(graph$contextNodes[["stop_|"]]), \ (node)as.list(graph$nodeProperties[[node]]))
pumpCont
is showing up as "read" as well as a state pointer. It's properly a state pointer
(and gets translated as such.)
So let's ignore utils that also show up as state pointers..?
It's still picking up stop
as a util though. Why? It's a reference to base "stop". I think that's correct.
base::stop
to base::`nextOr|stop`
object 'nextOr|stop' not found
in runPump. But, there is such a function?? Oh, get_function()
shows me the context: base::`nextOr|stop`
because translate() needs to ignore arguments to "::" or ":::", d'oh! I need something more sophisticated than substitute. So i pulled the trick from vadr to translate using an environment...
Sadly, this is noticeably slow compared to using substitute(). And it forces me to use temp vars in a couple spots that I do a double call (f(x)(y)
).
Now that we have iteror() we should go through generator and make sure that it doesn't unnecessarily invoke any tryCatch. This I did incidentally because it exposes the stack traces a little better.
pumpCont
is not being translated> g <- async:::compile(gen(for (i in 1:10) yield(i)), level=-1) > nextOr(g, "or") Error in pumpCont(..., envir = <environment>) : could not find function "pumpCont"
pumpCont
was translated where it appears as a variable but not where it appears as a call. Let's break "munge" when it gets to runPump and see what gives.
all_names(graph$nodes[[nodeName]], c("call", "store", "local", "arg", "var", "utility"))
Oh it's happening when I gather the context properties. I was doing
utils <- setdiff(graph$contextProperties[[contextName]]$utility, contextVars)
Which was incorrect.
With tracing on:
> gc <- async:::compile(gen(for (i in 1:10) yield(i)), level=-1) > nextOr(g, "or") R: i yield generator: yield pump: set unpause pump: pause Error in obj$nextOr(or, ...) : Generator finished unexpectedly
Do this to debug the next node:
> debug(environment(g$nextOr)$`stop_|pumpCont`) ls(environment(environment(gc$nextOr)$`stop_|pumpCont`))
(delete pages of fussing about this var not being translated or that one not being covered in walk
, etc.)
Okay now seems like the damn entry point stored in pumpCont is pointing out of scope. Ah. it was moved as an util after being moved as a data (among several other things)
I.e. you had each context return expressions for each of its states, and a list of names to be translated.
Most of the mechanics of substituting inlining, etc. would be the same, it's the "scanning things and automatically tagging vars" that would be skipped. So it seems like I'm far enough along path A that I wouldn't gain a whole lot trying to feel out path B just yet, though the idea of generating the dumb part automatically is appealing, but I can also throw memoization into the mix.
And just like that, I've had a nearly successful generator run after munging.
missing value where TRUE/FALSE needed
What's up with my graph? edgeProperties is giving me a "type: NULL" on some edge.
Oof, it's that thing where "c"/unlist adds numbers to names again. I've been trying to use "names" of a vector to track classes of a thing and there's some friction.
Nodes should not be named pumpCont
for instance.
Set up a global compiler option, Then repeat all generator tests with compilation off and on. I did this with symlinks... is there a better way? Doing source or test-file inside of a test file tended to have weird namespace errors.
for
loop over an iterator failsasyncOpts(compileLevel=-1, verbose=TRUE) x <- gen(for (i in icount()) {yield(i)}) nextElem(x)
i
returns a function, somehow. It's returning the iterator, not invoking it. It didn't recognize?
debug(environment(x$nextOr)$`_1.do_.body.1.yield`)
Ah, because the new environment is not enclosed in async
, method
is not finding methods I haven't exported. So for_cps
calls iteror
which was going to iteror.default
, because I hadn't @
exported iteror.iteror <- identity
.
Looks like munge.generator is pulling in resolve and reject (the callbacks the promise ctor gives us). For compilation we want to swap in a new promise there, though. I changed resolve_ and reject_ so these are no longer tail calls.
stop
is both a variable and a calllapply(names(graph$contextNodes[[contextName]]), \(x)as.list(graph$nodeProperties[[x]]))
Turns out I was doing a tryCatch({}, error=stop)
i.e. passing stop as a pointer; rewriting as
tryCatch({}, function(e) stop(e)
avoids confusing walk
this way..
I needed some way to attach the new resolve_
and reject_
pointers. I put
an accessor in the start set.
And on to the inevitable failures;
pump: run pump: windup pump: caught error by windup pump: stop: could not find function "pump#pumpCont" async: stop (rejecting) pump: unwind pump: pause > Unhandled promise error: could not find function "pump#pumpCont"
On inspection pump#pumpCont
is bound to NULL. Whereas in the pre-compiled generator
environment(a$state$pump)$pumpCont
If I turn on verbose, munge claims it was moved. What the hey? Okay, so the node is being moved after the state pointer. I filled the state pointer with a lazy value to work around that.
That moved the ball, now we have
Unhandled promise error: object 'key' not found
Ah, this is because delayedAssign
quotes its second argument. Lucky I wrote a whole package that will help me here!
Now the ball has moved to
no applicable method for 'conditionMessage' applied to an object of class "function"
This was due to name collision between argument err
and state variable err
. Don't name arguments the same as state variables you use, is the immediate lesson.
The compiled promise is not resolving; it's resolving the original promise, somehow.
I think the issue is that "utility" functions need to be translated IF they are closed in context.
But now I'm dealing with utility functions that are closed in other contexts.?
Wait, why is resolve
not a node? What uses it? It's passed to make_pump
as return
.
Oh. When I pass things to pump(return= and pump(stop=) they should be enumerated! Are they? They are now. But yet, it's hitting the wrong pump?
I should break in the "await" and see what callbacks it's registering. So the _1#await
callback is in a different environment. Hence
Okay yes so the await_
utility function should just be a trampoline, that way it gets translated, and I can pass an anon function to it.
Now after a bit more fussing I have a compiled async that did something! Still some errors though.
I turned "paranoid" on for tests at compile level -1. It checks that everything really got moved over!. I turned up ONE problem in async.
fails to walk() a compiled async containing try/finally upon finding a call to finally's cont
that wasn't translated. It's counted as a tailcall in the compiled graph;.
Sure enough it's not picked up as an edge....
Let's look at the graph uncompiled. It's node "_do_expr.1.awaited.then.1.ifTrue.1.do_finally.1". It's got "cont" in the tail, sure enough. It's not translated why? because walk
didn't see the cont
call for some reason. Except it does.
branch name cont
points to "_do_expr.1.awaited.then.1.ifTrue.1.do_finally.2", so it's not translated??? So let's see what munge does with this node.
Sure enough "nms" does not contain "cont". Oh Problem was I forgot to change a reference of edgeProperties$label to localName. I'm just confused how the fuck anything worked like that. (how the fuck did anything work that way?)
It sure would be nice if I had strong typing to check that I was using the same field name that it was created with.
Nope, that didn't help it catch "cont." The edge to "cont" didn't make it into the graph edgeproperties?
Let's turn on verbose for walk again.
OHHHHHHH it's because both targets exit to the same node. In particular, both "cont" and "return" went to the same place.
So I need to index edges by their local name(!) rather than their global name. Or else consolidate the local names. The former requires a bigger refactoring. The local names are unique and each does make for a distinct edge.
But the latter seems trickier.
Before I tackle this, make sure my graph isomorphism check works. I mean I'll have to rewrite it too. But rewriting from a good starting point should work.
How about the rewrite comes with a change of name from "edgeProperties" to "nodeEdges"?
Okay I think that went easier than I was expecting.
I know I need to fix the next things. It has been pretty difficult with the big tryCatch wrapped around everything, to get a stack trace. Do we even need a tryCatch around nextOr? I turned off the "base winding" for generators (but not for asyncs), and no tests failed. Nwo my stack traces are a bit healthier.
In compiled mode this generator screws up:
g <- gen({ tryCatch({ yield(5) stop("foo") yield(6) }, error=identity) yield(7) stop("bar") yield(8) })
Okay when I try to walk this graph I get fucky things? What's this whole do_expr.n? Okay it looks fine actually. Let's enable verbose and see where it breaks.
Okay it's "finished unexpectedly" after pause. Which means it didn't set the state after getting to pause.
> > expect_equal(nextElem(g), 7) generator: nextOr pump: run catch: windup R: stop("foo") catch: catch in windup catch: stop pump: removing from windup list catch: unwind pump: pause *********************************************** Error in obj$nextOr(or, ...) : Generator finished unexpectedly whereas in non-compiled code it did: > > expect_equal(nextElem(g), 7) generator: nextOr pump: run catch: windup R: stop("foo") catch: catch in windup catch: stop pump: removing from windup list catch: unwind pump: continuing after unwind ******************************************** R: identity catch: calling error handler pump: set continue pump: continue R: 7 yield generator: yield pump: set unpause pump: pause
I think this is a case of do_windup
not being translated, printf showing me that. Yeah, do_windup
should be a node but the graph draws it as a variable. windup
is a trampoline that takes two pointers, one to add to the windup list and another to continue.
stop: could not find function "_do_windup.tryCatch#parent.frame"
okay, but then it also resolves the wrong promise.
Okay now we're at "could not find function stop_
.
Also, error text is not being propagated somehow. I guess that's because it's an internal error. But the tryCatch should still get it.
Okay a couple things aren't right when we get to _return
. Here's how do_windup
is being translated:
function (cont, ...) { list(`_do_expr.1.awaited.then.2.ifTrue.1.do_finally.2`, ...) `entry#trace`("finally: windup\n") on.exit(`entry#trace`("finally: unwind\n")) tryCatch(`_do_expr.1.awaited.then.2.ifTrue.1.do_finally.2`(...), error = function(err) { `entry#trace`("finally: catch in windup\n") stop_(err) }) NULL }
surely that should be list(cont, ...), and stop_ is not being translated as a call either. And the function is apparently being invoked as cont(cont=cont)??
So I think the issue is I'm not walking into anonymous functions.
Okay we're back to the wrong "resolve" being called.
Whick means we're passing the wrong handler to register...
So it looks like munged resolve#pump
goes to the old pump.
Okay, when we are walking over a compiled "async" it thinks its "pump" is an external function and it copies it over. I changed it to move_value
which at least checks fora node pointer.
Now, however, my generator is failing to find corresponding local names stop_
in the munged graph. The compiled graph gives it a local name of "nextOr#stop." That's definitely not its local name. nextOr#stop should be translated to "stop" during munge?
So I think this is a consequence of not deduping. Two identical stop_
functions are winding up transferred. Doing a direct move hid this -- nextOr#stop was a leaky pointer so it did not previously show up when scanning for a local name for "stop".
Also, it was because I used "stop" instead of "base::stop" in one spot. So no need to dedupe yet,
And the same thing is happening with "resolve#pump". in the async. Which points to needing to regard tailcalls of lambdas as (potential) tailcalls themselves.
Now there are just five tests left!
> > expect_equal(nextElem(g), 5) generator: nextOr pump: run R: getOption("try.outFile", default = stderr()) R: TRUE catch: begin pump: Adding to windup list catch: windup R: 5 yield generator: yield pump: set unpause catch: unwind pump: pause > nextElem(g) generator: nextOr pump: run catch: windup R: stop("foo") catch: catch in windup catch: stop pump: removing from windup list catch: unwind pump: continuing after unwind R: try_handler catch: calling error handler Error in !silent_ : invalid argument type >
It's "silent_" in tryCatch that is missing a value.
Ah, I think try_handler
leaked scope. Should I fix that in "munge?"
Ohhh. the error handler is outside of "tryCatch"'s scope. Because it's just an "R" and I haven't been translating "R"'s scope either. Fine, duplicate the code.
execCallbacks(timeoutSecs, all, loop$id)
: Evaluation error: object of type 'closure' is not subsettable.This error showed up in async reject. Turns out to have been A name collision of lambda argument (err) with handler.
In this:
async({ tryCatch({ if (await(pr)) { return(5) not_run <<- FALSE 5 } else 4 }, finally={ cleanup <<- TRUE }) })
It should give a more comprehensible error. This particular error sounds like something wrong with tryCatch.
...
but either take a value or not..Okay, this ...
business in the handlers might be getting silly. I think for sanity's sake I might have to declare some trampolines take a val
and others don't. That will be another refactoring (probably before I do codegen)
...
from signal handlersJust use a local varable and another state if you want to return two things.
Would aid debugging comprehension if context / node labels followed the syntax tree more, were like (try.body.1, try.body,3, try.catch.R) and the like. This would also help node names be idempotent. This could also help communicate to the user "where" a particular async is.
I looked at clever ways to pull the tree out of the graph but the thing is that after I construct the graph the original parse tree information is lost. So I decided the best way to do this was assign the context names while I'm parsing, every constructor had a name assigned to it.
The current CPS version does not do synonyms/branch fallthrough. To fix that I
would need to know during construction which arguments are missing. is_R
and R_expr
help with this.
R's native "switch" does a thing where it can end up executing no arguments.
switch(-1, "one", "two", "three") #takes no branch and returns NULL.
That behavior would make it bad to use base R switch as a compiled version of switch, because it would end up falling off and not tailcalling.
Let's see, what about if you initialize a "mapping" at run time and then switch based on a label you pull out of there?
I also need to do something tricky to make my implementation of switch
traversable by walk
. Like bquote() out a custom switch_cps. that has visible graph labels
Okay, now debugging my switch implementation.
It works in interpreted mode but "paranoid" is showing me that the names of "goto" things aren't munged with the same names.
The answer might be to spend some time cleaning up my graphs.
Where are my edge labels? Esp for windup and unwind?
Fixed the edge labels.
Now this is my numeric "goto" test at compileLevel -1. It complains that after munging, two nodes are missing.
Browse[2]> setdiff(names(graph$nodes), unname(cnodes)) [1] ".switch6.goto__goto_" ".switch3.goto__goto_"
And it looks like it's because those nodes were literally identical to other nodes. All the goto nodes are really identical, huh, because they don't know their sources and have been ripped out of their identical environments.
So we really found a way that this paranoid graph isomorphism isn't always true! Can we relax the check and still retain value in it?
Maybe check that reverse edges on the new graph correspond to reverse edges on old?
i.e. if there is an edge from old[A] to old[B], then new[B] should (still) exist and it should be identical() to somthing that new[A]'s exit
Graphviz does not like the new labels (clearly). Spend some time making sure that compiled generator's graphs look okay.
I ditched hashbag and switched to plain envs. Seems faster.
It's difficult to profile recursive algorithms, but I think from looking at profvis output, that "contains" and is_forced are eating a lot of time. I sure wish that counted as "one" invocation. It's more useful to look at the flame graph and notice what's spending a lot of time on top.
The solution for "contains" might be to just assign every node an attribute giving local name, and (perhaps) the result of all.names. Could call it %<-%
and have it memoize on the function text.
I also rewrote all_names
to use a collector callback instead of a big pile of c().
This appears to make less work for the GC!
Made some methods that allow single stepping the R code, as well as to allow single stepping the "interior" code. Seems pretty useful. I might get rid of trace
at some point....
It's doing something different in forGraph=TRUE
than in forGraph=FALSE
which I thought I put a lot of checks around.
It's failing to see the edge Node: .{1.tryCatch__do_windup -> .{1.tryCatch__stop_
which is in a lambda. Why is all_names
missing it.?
Okay, the problem is that the "tryCatch" in catch_cps
's do_windup
is not in tail position.
This was fixed by working on and streamlining all_names.
The problem I'm trying to solve is that there are a bunch of duplicate references to the target environment captured in the graph. If I'm going to be able to reuse a compiled generator, that should just be one ref to the target environment; it's currently being duplicated all around with
Have to stick a bounce back in there. That means a generator has two bounces per loop instead of 1 hmm. Or else we have to bounce on yieldFrom's yield. That might make more sense.
I suppose I'm on a roll so I think Ii'll add this. on.exit handlers will be global to the generator they're running in. Conceptually they will be operating outside of any loop, even if written within, Therefore at construction time, they are passed back to the top-level pump(), using a register_exit
callback, in return for a handle. At run time, the on.exit
CPS function calls a callback add_exit
with that handle to activate the handler (which appends the handle to a list). At the pump level, the on-exit handler will behave like a switch statement looping over the list and calling until it is exhausted.
At the pump level, pump return_
should jump to do_exits before continuing to exit.
You can put a check for doneness into an on-exit clause in pump(). If it exits without pausing, enter a special state, bounce to the on-exit handler and pump() again.
I think I will work on getting that mechanic right first, then worry about switch-case alignment later. Make it work for the interpreted case before trying to add the bquote shenanigans to make the compiler pick it up.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.