Await, and Streams

test failure: can print a name-munged generator

Generator's print method is not finding the "state" variable that nextOr uses, because it was munged to 'nextOr|state'.

Is nextOr part of the munging? really? I guess I think/thought it and pump would have to be namespace-munged.

Okay, I think the answer is to add a "getState" function to the start set. The start set is really "nodes that we have to carry over (whether they're in the graph of not) and have consistent names."

walk(graphc) finds a second context just having "runPump" in it.

This would mean that runPump was not translated? But my thingy says it's translated. It should be moving over with the nodes. Am I tracing that?

But wait, the nodes are moved last! They should move first, perhaps. But wait, then, what thing is my translation finding? It shouldn't find a function there at all?

Mysteries. I changed the order and...

It is still doing it.

> gc <- compile(g, level=-1)
> graph <- walk(g)
> graphc <- walk(gc)
> length(graphc$contexts)
[1] 2
> names(graphc$contexts)
[1] "runPump|" "entry|"

But the log said:

Context:  stop_| 
 Moving nodes:
   Node: `stop_|`$`stop_` -> `stop_`
   Node: `stop_|`$`runPump` -> `runPump`

Static pointer?: `runPump` -> `stop_|runPump`
     with translated reference: `stop_|`$`runPump` -> `runPump`

Weird... yeah, just runPump is failing to be ported?

I have both the pre- and post-compiled graphs;

> (graph$contexts[["stop_|"]][["runPump"]])
function(...) { ... }
<environment: 0x55ed7e7ea790>
> graphc$contexts[["runPump|"]]$pump
function(...) { ... }
<environment: 0x55ed7e7ea790>

Hold on. There's also a "runPump" in the normal context. They are different:

graphc$contexts[['entry|']][["runPump"]]
function (...) 
{
    assert(`stop_|action` == "pause", paste0("pump asked to continue, but last action was ", 
        `stop_|action`))
    `stop_|pumpCont`(...)
    ...
}
<environment: 0x55ed760dfa48>
> graphc$contexts[["runPump|"]]$runPump
function(...) {
    assert(action == "pause",
           paste0("pump asked to continue, but last action was ", action))
    pumpCont(...) # here's where you inject a return value into a yield?
    ...
}
<environment: 0x55ed7e7ea790>

To the translated one went into the right spot while the other one is being picked up as a node? OH! It's getting them from getStartSet somehow?

lapply(getStartSet.generator(gc), environment)
$entry
<environment: 0x55ed760dfa48>
...
$runPump
<environment: 0x55ed7e7ea790>

Okay yeah it's coming from getStartSet. How? A bad variable reference. Was using global g instead of argument x

runPump is being translated as stop|runPump???

Okay, so after munging I have both "runPump" and "stop|runPump" defined. It looks like it is moving the node as well as data. It shouldn't do that. It's doing that because runPump is "read" as well as called.

Who does the reading? Is that in the graph?

> lapply(as.list(graph$contextNodes[[graph$nodeContexts[["runPump"]]]]), function(nodeName) graph$nodeProperties[[nodeName]]$read)
...
$pump
[1] "action"  "nonce"   "runPump" "value"  
...

Okay, it's registering as a var because of the call to doWindup(runPump).

Options for fixing this:

[ ] Since this is in runPump with will not exist when codegen, maybe just mask parts of code from the "walk" algorithm. Maybe make walk() ignore anything inside of an "identity". But we It would also have to mask when doing variable translations.
[ ] Make munge ignore any var that points to a canonical node with the canonical node name? This, though, how would it distinguish a legit pointer? And part of the problem is that this ends up in varTranslations, it would have to be something we do before that.
[ ] Either canonically Ignore doWindup's args during walk,
[x] Make walk treat doWindup as a trampoline call even if it's not the tail.

I did 4. just needed to make it detect trampoline calls even outside the function tail.

Error finding "trace": need to move utility functions.

Well we need trace over there in the destination environment too. Why isn't trace gathered? I need to also carry over "utility" functions, then. This works the same as moving nodes, I guess.

I made it copy over utility functions. Problem: this creates a bunch of actual duplicates of "trace,", bu thtis doesn't stop me right noe.

Bigger Problem: it is copying over "stop" and "pumpCont" as if they are utilities.

Context:  stop_| 
 Moving nodes:
   Node: `stop_|`$`stop_` -> `stop_`
   Node: `stop_|`$`runPump` -> `runPump`
   Node: `stop_|`$`return_` -> `return_`
   Node: `stop_|`$`ret_` -> `_3.do_.body.2.ret`
   Node: `stop_|`$`pump` -> `pump`
   Node: `stop_|`$`pause_` -> `_3.do_.body.1.yield.pause`
 Moving utils:
   Function: `stop_|`$`stop` -> `stop_|stop`
   Function: `stop_|`$`trace` -> `stop_|trace`
   Function: `stop_|`$`pumpCont` -> `stop_|pumpCont`
 Moving data:
   State var: `action` -> `stop_|action`
   State pointer: `pumpCont` -> `stop_|pumpCont`
     with translated reference: `entry|`$`R_` -> `entry`
   State var: `verbose` -> `stop_|verbose`
   State var: `nonce` -> `stop_|nonce`
   State var: `value` -> `stop_|value`

Checking with the graph:

graph$contextProperties[["stop_|"]]
lapply(as.list(graph$contextNodes[["stop_|"]]), \ (node)as.list(graph$nodeProperties[[node]]))

pumpCont is showing up as "read" as well as a state pointer. It's properly a state pointer (and gets translated as such.) So let's ignore utils that also show up as state pointers..? It's still picking up stop as a util though. Why? It's a reference to base "stop". I think that's correct.

Name munging is inappropriately munging `base::stop` to base::`nextOr|stop`

object 'nextOr|stop' not found in runPump. But, there is such a function?? Oh, get_function() shows me the context: base::`nextOr|stop`

because translate() needs to ignore arguments to "::" or ":::", d'oh! I need something more sophisticated than substitute. So i pulled the trick from vadr to translate using an environment...

Sadly, this is noticeably slow compared to using substitute(). And it forces me to use temp vars in a couple spots that I do a double call (f(x)(y)).

basic generator should run without windups

Now that we have iteror() we should go through generator and make sure that it doesn't unnecessarily invoke any tryCatch. This I did incidentally because it exposes the stack traces a little better.

call to `pumpCont` is not being translated

> g <- async:::compile(gen(for (i in 1:10) yield(i)), level=-1)
> nextOr(g, "or")
Error in pumpCont(..., envir = <environment>) : 
  could not find function "pumpCont"

pumpCont was translated where it appears as a variable but not where it appears as a call. Let's break "munge" when it gets to runPump and see what gives.

all_names(graph$nodes[[nodeName]], c("call", "store", "local", "arg", "var", "utility"))

Oh it's happening when I gather the context properties. I was doing

   utils <- setdiff(graph$contextProperties[[contextName]]$utility, contextVars)

Which was incorrect.

Generator is not yielding

With tracing on:

> gc <- async:::compile(gen(for (i in 1:10) yield(i)), level=-1)
> nextOr(g, "or")

R: i
yield
generator: yield
pump: set unpause
pump: pause
Error in obj$nextOr(or, ...) : Generator finished unexpectedly

Do this to debug the next node:

> debug(environment(g$nextOr)$`stop_|pumpCont`)

ls(environment(environment(gc$nextOr)$`stop_|pumpCont`))

(delete pages of fussing about this var not being translated or that one not being covered in walk, etc.)

Okay now seems like the damn entry point stored in pumpCont is pointing out of scope. Ah. it was moved as an util after being moved as a data (among several other things)

What if you were just more dumb about this?

I.e. you had each context return expressions for each of its states, and a list of names to be translated.

Most of the mechanics of substituting inlining, etc. would be the same, it's the "scanning things and automatically tagging vars" that would be skipped. So it seems like I'm far enough along path A that I wouldn't gain a whole lot trying to feel out path B just yet, though the idea of generating the dumb part automatically is appealing, but I can also throw memoization into the mix.

And just like that, I've had a nearly successful generator run after munging.

Graph failing with `missing value where TRUE/FALSE needed`

What's up with my graph? edgeProperties is giving me a "type: NULL" on some edge.

Oof, it's that thing where "c"/unlist adds numbers to names again. I've been trying to use "names" of a vector to track classes of a thing and there's some friction.

Exclude state pointers when determining a local name.

Nodes should not be named pumpCont for instance.

Repeat all tests with compiled

Set up a global compiler option, Then repeat all generator tests with compilation off and on. I did this with symlinks... is there a better way? Doing source or test-file inside of a test file tended to have weird namespace errors.

`for` loop over an iterator fails

asyncOpts(compileLevel=-1, verbose=TRUE)
x <- gen(for (i in icount()) {yield(i)})
nextElem(x)

i returns a function, somehow. It's returning the iterator, not invoking it. It didn't recognize?

debug(environment(x$nextOr)$`_1.do_.body.1.yield`)

Ah, because the new environment is not enclosed in async, method is not finding methods I haven't exported. So for_cps calls iteror which was going to iteror.default, because I hadn't @exported iteror.iteror <- identity.

munging asyncs.

Looks like munge.generator is pulling in resolve and reject (the callbacks the promise ctor gives us). For compilation we want to swap in a new promise there, though. I changed resolve_ and reject_ so these are no longer tail calls.

`stop` is both a variable and a call

lapply(names(graph$contextNodes[[contextName]]), \(x)as.list(graph$nodeProperties[[x]]))

Turns out I was doing a tryCatch({}, error=stop) i.e. passing stop as a pointer; rewriting as tryCatch({}, function(e) stop(e) avoids confusing walk this way..

making async compile

I needed some way to attach the new resolve_ and reject_ pointers. I put an accessor in the start set.

And on to the inevitable failures;

compiled async can't find pumpCont

pump: run
pump: windup
pump: caught error by windup
pump: stop: could not find function "pump#pumpCont"
async: stop (rejecting)
pump: unwind
pump: pause
> Unhandled promise error: could not find function "pump#pumpCont"

On inspection pump#pumpCont is bound to NULL. Whereas in the pre-compiled generator

environment(a$state$pump)$pumpCont

If I turn on verbose, munge claims it was moved. What the hey? Okay, so the node is being moved after the state pointer. I filled the state pointer with a lazy value to work around that.

That moved the ball, now we have

Unhandled promise error: object 'key' not found

Ah, this is because delayedAssign quotes its second argument. Lucky I wrote a whole package that will help me here!

conditionMessage

Now the ball has moved to

no applicable method for 'conditionMessage' applied to an object of class "function"

This was due to name collision between argument err and state variable err. Don't name arguments the same as state variables you use, is the immediate lesson.

Async executes but resolves with NULL

The compiled promise is not resolving; it's resolving the original promise, somehow.

I think the issue is that "utility" functions need to be translated IF they are closed in context.

But now I'm dealing with utility functions that are closed in other contexts.?

Wait, why is resolve not a node? What uses it? It's passed to make_pump as return.

Oh. When I pass things to pump(return= and pump(stop=) they should be enumerated! Are they? They are now. But yet, it's hitting the wrong pump?

I should break in the "await" and see what callbacks it's registering. So the _1#await callback is in a different environment. Hence

Okay yes so the await_ utility function should just be a trampoline, that way it gets translated, and I can pass an anon function to it.

Now after a bit more fussing I have a compiled async that did something! Still some errors though.

Perform a check for well-formedness of generator on all generators during unit test.

I turned "paranoid" on for tests at compile level -1. It checks that everything really got moved over!. I turned up ONE problem in async.

Walking compiled try/finally

fails to walk() a compiled async containing try/finally upon finding a call to finally's cont that wasn't translated. It's counted as a tailcall in the compiled graph;.

Sure enough it's not picked up as an edge....

Let's look at the graph uncompiled. It's node "_do_expr.1.awaited.then.1.ifTrue.1.do_finally.1". It's got "cont" in the tail, sure enough. It's not translated why? because walk didn't see the cont call for some reason. Except it does.

branch name cont points to "_do_expr.1.awaited.then.1.ifTrue.1.do_finally.2", so it's not translated??? So let's see what munge does with this node.

Sure enough "nms" does not contain "cont". Oh Problem was I forgot to change a reference of edgeProperties$label to localName. I'm just confused how the fuck anything worked like that. (how the fuck did anything work that way?)

It sure would be nice if I had strong typing to check that I was using the same field name that it was created with.

Nope, that didn't help it catch "cont." The edge to "cont" didn't make it into the graph edgeproperties?

Let's turn on verbose for walk again.

OHHHHHHH it's because both targets exit to the same node. In particular, both "cont" and "return" went to the same place.

So I need to index edges by their local name(!) rather than their global name. Or else consolidate the local names. The former requires a bigger refactoring. The local names are unique and each does make for a distinct edge.

But the latter seems trickier.

Before I tackle this, make sure my graph isomorphism check works. I mean I'll have to rewrite it too. But rewriting from a good starting point should work.

How about the rewrite comes with a change of name from "edgeProperties" to "nodeEdges"?

Okay I think that went easier than I was expecting.

does generator need to catch?

I know I need to fix the next things. It has been pretty difficult with the big tryCatch wrapped around everything, to get a stack trace. Do we even need a tryCatch around nextOr? I turned off the "base winding" for generators (but not for asyncs), and no tests failed. Nwo my stack traces are a bit healthier.

tryCatch is screwing something up

In compiled mode this generator screws up:

  g <- gen({
    tryCatch({
      yield(5)
      stop("foo")
      yield(6)
    }, error=identity)
    yield(7)
    stop("bar")
    yield(8)
  })

Okay when I try to walk this graph I get fucky things? What's this whole do_expr.n? Okay it looks fine actually. Let's enable verbose and see where it breaks.

Okay it's "finished unexpectedly" after pause. Which means it didn't set the state after getting to pause.

> > expect_equal(nextElem(g), 7)
generator: nextOr
pump: run
catch: windup
R: stop("foo")
catch: catch in windup
catch: stop
pump: removing from windup list
catch: unwind
pump: pause ***********************************************
Error in obj$nextOr(or, ...) : Generator finished unexpectedly

whereas in non-compiled code it did:

> > expect_equal(nextElem(g), 7)
generator: nextOr
pump: run
catch: windup
R: stop("foo")
catch: catch in windup
catch: stop
pump: removing from windup list
catch: unwind
pump: continuing after unwind ********************************************
R: identity
catch: calling error handler
pump: set continue
pump: continue
R: 7
yield
generator: yield
pump: set unpause
pump: pause

I think this is a case of do_windup not being translated, printf showing me that. Yeah, do_windup should be a node but the graph draws it as a variable. windup is a trampoline that takes two pointers, one to add to the windup list and another to continue.

stop: could not find function "_do_windup.tryCatch#parent.frame"

okay, but then it also resolves the wrong promise.

Okay now we're at "could not find function stop_.

Also, error text is not being propagated somehow. I guess that's because it's an internal error. But the tryCatch should still get it.

Okay a couple things aren't right when we get to _return. Here's how do_windup is being translated:

function (cont, ...)
{
    list(`_do_expr.1.awaited.then.2.ifTrue.1.do_finally.2`, ...)
    `entry#trace`("finally: windup\n")
    on.exit(`entry#trace`("finally: unwind\n"))
    tryCatch(`_do_expr.1.awaited.then.2.ifTrue.1.do_finally.2`(...),
        error = function(err) {
            `entry#trace`("finally: catch in windup\n")
            stop_(err)
        })
    NULL
}

surely that should be list(cont, ...), and stop_ is not being translated as a call either. And the function is apparently being invoked as cont(cont=cont)??

So I think the issue is I'm not walking into anonymous functions.

Okay we're back to the wrong "resolve" being called.

Whick means we're passing the wrong handler to register...

So it looks like munged resolve#pump goes to the old pump.

Okay, when we are walking over a compiled "async" it thinks its "pump" is an external function and it copies it over. I changed it to move_value which at least checks fora node pointer.

Now, however, my generator is failing to find corresponding local names stop_ in the munged graph. The compiled graph gives it a local name of "nextOr#stop." That's definitely not its local name. nextOr#stop should be translated to "stop" during munge?

So I think this is a consequence of not deduping. Two identical stop_ functions are winding up transferred. Doing a direct move hid this -- nextOr#stop was a leaky pointer so it did not previously show up when scanning for a local name for "stop".

Also, it was because I used "stop" instead of "base::stop" in one spot. So no need to dedupe yet,

And the same thing is happening with "resolve#pump". in the async. Which points to needing to regard tailcalls of lambdas as (potential) tailcalls themselves.

Now there are just five tests left!

Error in !silent_: invlid argument type.

> > expect_equal(nextElem(g), 5)
generator: nextOr
pump: run
R: getOption("try.outFile", default = stderr())
R: TRUE
catch: begin
pump: Adding to windup list
catch: windup
R: 5
yield
generator: yield
pump: set unpause
catch: unwind
pump: pause
> nextElem(g)
generator: nextOr
pump: run
catch: windup
R: stop("foo")
catch: catch in windup
catch: stop
pump: removing from windup list
catch: unwind
pump: continuing after unwind
R: try_handler
catch: calling error handler
Error in !silent_ : invalid argument type
>

It's "silent_" in tryCatch that is missing a value.

Ah, I think try_handler leaked scope. Should I fix that in "munge?"

Ohhh. the error handler is outside of "tryCatch"'s scope. Because it's just an "R" and I haven't been translating "R"'s scope either. Fine, duplicate the code.

Error in `execCallbacks(timeoutSecs, all, loop$id)`: Evaluation error: object of type 'closure' is not subsettable.

This error showed up in async reject. Turns out to have been A name collision of lambda argument (err) with handler.

Awaiting on a "pr" that doesn't exist gives a "subscript out of bounds"

In this:

  async({
    tryCatch({
      if (await(pr)) {
        return(5)
        not_run <<- FALSE
        5
      } else 4
    }, finally={
      cleanup <<- TRUE
    })
  })

It should give a more comprehensible error. This particular error sounds like something wrong with tryCatch.

Refactor so that trampoline handlers do not use `...` but either take a value or not..

Okay, this ... business in the handlers might be getting silly. I think for sanity's sake I might have to declare some trampolines take a val and others don't. That will be another refactoring (probably before I do codegen)

Remove `...` from signal handlers

Just use a local varable and another state if you want to return two things.

better context/node labels

Would aid debugging comprehension if context / node labels followed the syntax tree more, were like (try.body.1, try.body,3, try.catch.R) and the like. This would also help node names be idempotent. This could also help communicate to the user "where" a particular async is.

I looked at clever ways to pull the tree out of the graph but the thing is that after I construct the graph the original parse tree information is lost. So I decided the best way to do this was assign the context names while I'm parsing, every constructor had a name assigned to it.

Fixing "switch"

The current CPS version does not do synonyms/branch fallthrough. To fix that I would need to know during construction which arguments are missing. is_R and R_expr help with this.

R's native "switch" does a thing where it can end up executing no arguments.

switch(-1, "one", "two", "three") #takes no branch and returns NULL.

That behavior would make it bad to use base R switch as a compiled version of switch, because it would end up falling off and not tailcalling.

Let's see, what about if you initialize a "mapping" at run time and then switch based on a label you pull out of there?

I also need to do something tricky to make my implementation of switch traversable by walk. Like bquote() out a custom switch_cps. that has visible graph labels

Okay, now debugging my switch implementation.

It works in interpreted mode but "paranoid" is showing me that the names of "goto" things aren't munged with the same names.

The answer might be to spend some time cleaning up my graphs.

Where are my edge labels? Esp for windup and unwind?

Fixed the edge labels.

Now this is my numeric "goto" test at compileLevel -1. It complains that after munging, two nodes are missing.

 Browse[2]> setdiff(names(graph$nodes), unname(cnodes))
 [1] ".switch6.goto__goto_" ".switch3.goto__goto_"

And it looks like it's because those nodes were literally identical to other nodes. All the goto nodes are really identical, huh, because they don't know their sources and have been ripped out of their identical environments.

So we really found a way that this paranoid graph isomorphism isn't always true! Can we relax the check and still retain value in it?

Maybe check that reverse edges on the new graph correspond to reverse edges on old?

i.e. if there is an edge from old[A] to old[B], then new[B] should (still) exist and it should be identical() to somthing that new[A]'s exit

Fixing graphviz output on compiled generators

Graphviz does not like the new labels (clearly). Spend some time making sure that compiled generator's graphs look okay.

Improve speed of collecting graph

I ditched hashbag and switched to plain envs. Seems faster.

It's difficult to profile recursive algorithms, but I think from looking at profvis output, that "contains" and is_forced are eating a lot of time. I sure wish that counted as "one" invocation. It's more useful to look at the flame graph and notice what's spending a lot of time on top.

The solution for "contains" might be to just assign every node an attribute giving local name, and (perhaps) the result of all.names. Could call it %<-% and have it memoize on the function text.

I also rewrote all_names to use a collector callback instead of a big pile of c(). This appears to make less work for the GC!

Debugging tools

Made some methods that allow single stepping the R code, as well as to allow single stepping the "interior" code. Seems pretty useful. I might get rid of trace at some point....

Graph is not getting past a "try"

It's doing something different in forGraph=TRUE than in forGraph=FALSE which I thought I put a lot of checks around.

It's failing to see the edge Node: .{1.tryCatch__do_windup -> .{1.tryCatch__stop_

which is in a lambda. Why is all_names missing it.? Okay, the problem is that the "tryCatch" in catch_cps's do_windup is not in tail position.

This was fixed by working on and streamlining all_names.

move eval_ to a handler

The problem I'm trying to solve is that there are a bunch of duplicate references to the target environment captured in the graph. If I'm going to be able to reuse a compiled generator, that should just be one ref to the target environment; it's currently being duplicated all around with

yieldFrom under "run" winds up the stack

Have to stick a bounce back in there. That means a generator has two bounces per loop instead of 1 hmm. Or else we have to bounce on yieldFrom's yield. That might make more sense.

on.exit

I suppose I'm on a roll so I think Ii'll add this. on.exit handlers will be global to the generator they're running in. Conceptually they will be operating outside of any loop, even if written within, Therefore at construction time, they are passed back to the top-level pump(), using a register_exit callback, in return for a handle. At run time, the on.exit CPS function calls a callback add_exit with that handle to activate the handler (which appends the handle to a list). At the pump level, the on-exit handler will behave like a switch statement looping over the list and calling until it is exhausted.

At the pump level, pump return_ should jump to do_exits before continuing to exit.

You can put a check for doneness into an on-exit clause in pump(). If it exits without pausing, enter a special state, bounce to the on-exit handler and pump() again.

I think I will work on getting that mechanic right first, then worry about switch-case alignment later. Make it work for the interpreted case before trying to add the bquote shenanigans to make the compiler pick it up.

crowding/generators documentation built on June 28, 2023, 6:14 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

crowding/generators Coroutines: Generators / Yield, Async / Await, and Streams