Wrapping Nonstandard-evaluating functions

Or, how to use Metaprogramming to work around other people's code that used Metaprogramming

Automating graph labels.

Probably the most frequent and mundane application of non-standard evaluation is to automate writing axis labels. Here is code that plots a sine curve:

x <- seq(0, 2 * pi, length = 100)
plot(x, sin(x), type = "l")

The graph that is produced has labels on its horizontal axis, sin(x) and x on the vertical axis.

If, instead, we stored sin(x) in a variable y, a different label is drawn on the Y axis.

y <- sin(x)
plot(x, y, type = "l")

So plotting y <- sin(x) does not mean that wrting y is equivalent to writing sin(x). plot is still able to tell the difference.

In standard evaluation a function only depends on the value of its arguments. If y is defined to have the value of sin(x), then f(y) should do the same thing as f(sin(x)). Clearly plot breaks this rule. The R community has calls this non-standard evaluation. A function that uses non-standard evalution can have behavtior depends on the way its argument is written, not just what values they evaluate to.

NSE makes it difficult to wrap functions.

Let's say we wanted to make a function lineplot that behaves just like plot but has a default plot argument.

I want to make that a hard rule. Writing lineplot(x, sin(y)) should function exactly as writing `plot(x, sin(y), sin(x))

lineplot <- function(x, y, type="l", ...) {
  plot(x, y, type, ...)
}

lineplot(x, sin(x))

But our axis label is now wrong. It now always says x and y instead of sin(y) or whatever else we might have wrote. lineplot didn't live up to the spec of behaving just like plot but for one detail.

The NSE package easily wraps NSE functions with invoke

It's harder to fix this problem than you might think, so I'll first show nse's way of writing lineplot:

lineplot <- function(x, y, type="l", ...) {
  invoke(plot)(x, y, type=type, ...)
}

lineplot(x, sin(x))

Here's how it works. invoke(plot) returns a function. That function takes its arguments by name instead of evaluating them.

R's builtin facilities have weird edge cases

Here are various ways base R provides of writing lineplot that all don't quite meet the spec.


Counterargument: do(plot(x=x, y=y)

How to wrap "plot"

How to wrap an NSE function in a higher order function

match.call vs. default arguments

match.call vs. ...

parent.frame() vs. default arguments.

pass data to an indirect function, without blowing up the stacktrace

EXAMPLE

Using arg_env instead of parent_frame also allows non-standard-evaluating functions to be composable. (example with dlply and glm functions)

caller and with_caller

In writing NSE functions you should as much as possible use arg_env to determine where arguments come from. First convert all uses of parent.frame() into arg_env(). Where a function is called from is another thing, called caller(). This is actually a bit different from what parent.frame() does. caller() guards against giving surprising (wrong) results in situations involving lazy evaluation or when the caller can no longer be determined -- it prefers to throw errors instead.

Here is a practical example of why.

Argument lists and ...

In R, to write a function that takes any number of arguments, we use the symbol ..., which represents a sort of list of arguments. This is somewhat similar to Python's solution of *args, **kwargs, except that while in Python variadic arguments are stored in first-class data structures, in R the ... is opaque. It's tricky to do things like take the second argument, or take all the odd arguments, or obtain a list of argument names without forcing them, or concatenating two argument lists together. The only thing you can do with a ... when you have it is to pass them all to another function's argument list.

But under the hood, a ... is just a named list of promises. The nse package provides a dots class and accessor functions that let you manipulate sequences of promises the same way as you would manipulate other kinds of lists.

As an example, consider trying to implement the R library function switch. This takes any number of arguments, and uses the argument expr -- either a number or a name -- to decide which of the other arguments to return, only forcing that argument. But in base R it is difficult to access unevaluated arguments by index or name, so switch has a C implementation.

It's actually possible to implement switch in base R, but... coming up with a solution requires a bit of tinkering. I can do it by synthesizing a function with the right arglist to match the required argument. The weirdest part was working out how names like ..3 work.

switch2 <- function(expr, ...) {
  n <- names(substitute(list(...)))[-1]
  if (!is.null(n))
      arglist <- as.pairlist(structure(
          rep(list(quote(expr=)), length(n)),
          names=n))
  else
      (arglist <- as.pairlist(alist(...=)))

  if (is.numeric(expr))
      body <- as.name(paste0("..", expr))
  else
      body <- as.name(expr)
  f <- eval(substitute(`function`(arglist, body),
                         list(arglist=arglist, body=body)))
  f(...)
}

The point of this digression is that with a direct interface to manipulate argument lists, switch is easy:

switch3 <- function(expr, ...) {
  force_(dots(...)[[expr]])
}

Missing arguments.

When a function defined as having an argument but that argument is not given, that argument is "missing."

For example,

f <- function(x) missing(x)
f(1) ##FALSE
f() ## X


crowding/nse documentation built on Jan. 5, 2024, 12:14 a.m.