A quosure is a special type of [defused expression][topic-defuse] that keeps track of the original context the expression was written in. The tracking capabilities of quosures are important when interfacing [data-masking][topic-data-mask] functions together because the functions might come from two unrelated environments, like two different packages.

Blending environments

Let's take an example where the R user calls the function summarise_bmi() from the foo package to summarise a data frame with statistics of a BMI value. Because the height variable of their data frame is not in metres, they use a custom function div100() to rescale the column.

# Global environment of user

div100 <- function(x) {
  x / 100
}

dplyr::starwars %>%
  foo::summarise_bmi(mass, div100(height))

The summarise_bmi() function is a data-masking function defined in the namespace of the foo package which looks like this:

# Namespace of package foo

bmi <- function(mass, height) {
  mass / height^2
}

summarise_bmi <- function(data, mass, height) {
  data %>%
    bar::summarise_stats(bmi({{ mass }}, {{ height }}))
}

The foo package uses the custom function bmi() to perform a computation on two vectors. It interfaces with summarise_stats() defined in bar, another package whose namespace looks like this:

# Namespace of package bar

check_numeric <- function(x) {
  stopifnot(is.numeric(x))
  x
}

summarise_stats <- function(data, var) {
  data %>%
    dplyr::transmute(
      var = check_numeric({{ var }})
    ) %>%
    dplyr::summarise(
      mean = mean(var, na.rm = TRUE),
      sd = sd(var, na.rm = TRUE)
    )
}

Again, the package bar uses a custom function, check_numeric(), to validate its input. It also interfaces with data-masking functions from dplyr (using the [define-a-constant][topic-double-evaluation] trick to avoid issues of double evaluation).

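There are three data-masking functions simultaneously interfacing in this snippet: foo::summarise_bmi(), which passes its inputs to bar::summarise_stats(), which in turn passes its input to dplyr::transmute().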

There is a fourth context, the global environment where summarise_bmi() is called with two columns defined in a data frame, one of which is transformed on the fly with the user function div100().

All of these contexts (except to some extent the global environment) contain functions that are private and invisible to foreign functions. Yet, the final expanded data-masked expression that is evaluated down the line looks like this (with caret characters indicating the quosure boundaries):

dplyr::transmute(
  var = ^check_numeric(^bmi(^mass, ^div100(height)))
)

The role of quosures is to let R know that check_numeric() should be found in the bar package, bmi() in the foo package, and div100() in the global environment.
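The lookup described above can be reproduced on a small scale with rlang alone. The following is a minimal sketch, not code from the foo or bar packages: make_quosure() and scale_up() are hypothetical names, and the function environment stands in for a package namespace whose helpers are invisible from the outside.

```r
library(rlang)

# Hypothetical stand-in for a package namespace: `scale_up()` only exists
# inside the execution environment of `make_quosure()`.
make_quosure <- function() {
  scale_up <- function(x) x * 10
  # `quo()` records this environment alongside the expression
  quo(scale_up(value))
}

q <- make_quosure()

# `value` is found in the data mask, `scale_up()` in the quosure environment
result <- eval_tidy(q, data = list(value = 1:3))
result
```

Even though scale_up() is not defined where eval_tidy() is called, the evaluation succeeds because the quosure remembers where its expression was written.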

When should I create quosures?

As a tidyverse user you generally don't need to worry about quosures because {{ and ... will create them for you. Introductory texts like Programming with dplyr or the [standard data-mask programming patterns][topic-data-mask-programming] don't even mention the term. In more complex cases you might need to create quosures with [enquo()] or [enquos()] (even though you generally don't need to know or care that these functions return quosures). In this section, we explore when quosures are necessary in these more advanced applications.

Foreign and local expressions

As a rule of thumb, quosures are only needed for arguments defused with [enquo()] or [enquos()] (or with {{, which calls enquo() implicitly):

my_function <- function(var) {
  var <- enquo(var)
  their_function(!!var)
}

# Equivalently
my_function <- function(var) {
  their_function({{ var }})
}
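To make the captured context visible, here is a small sketch that returns the quosure instead of forwarding it. The names capture(), user_fn() and local_factor are hypothetical and only serve the illustration.

```r
library(rlang)

# Hypothetical function that defuses its argument and returns the quosure
capture <- function(var) {
  enquo(var)
}

# Plays the role of the user: `local_factor` only exists in this call frame
user_fn <- function() {
  local_factor <- 2
  capture(local_factor * 21)
}

q <- user_fn()

is_quosure(q)
quo_get_expr(q)   # the defused expression
answer <- eval_tidy(q)
answer
```

The quosure can still be evaluated after user_fn() has returned because it holds on to the environment where local_factor was defined.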

Wrapping defused arguments in quosures is needed because expressions supplied as arguments come from a different environment, the environment of your user. For local expressions created in your function, you generally don't need to create quosures:

my_mean <- function(data, var) {
  # `expr()` is sufficient, no need for `quo()`
  expr <- expr(mean({{ var }}))
  dplyr::summarise(data, !!expr)
}

my_mean(mtcars, cyl)

Using [quo()] instead of [expr()] would have worked too but it is superfluous because dplyr::summarise(), which uses [enquos()], is already in charge of wrapping your expression within a quosure scoped in your environment.

The same applies if you evaluate manually. By default, [eval()] and [eval_tidy()] capture your environment:

my_mean <- function(data, var) {
  expr <- expr(mean({{ var }}))
  eval_tidy(expr, data)
}

my_mean(mtcars, cyl)
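The following sketch shows why a plain expr() is enough even when the local expression refers to a private helper. my_range_width() and width() are hypothetical names; the example assumes only rlang and the built-in mtcars data.

```r
library(rlang)

my_range_width <- function(data, var) {
  # Local helper, visible only inside this function
  width <- function(x) max(x) - min(x)

  # The call to `width()` is a plain local expression;
  # only the user input embraced with `{{` is wrapped in a quosure
  expr <- expr(width({{ var }}))

  # `eval_tidy()` evaluates in this function's environment by default,
  # so `width()` is found without creating a quosure for it
  eval_tidy(expr, data)
}

out <- my_range_width(mtcars, cyl)
out
```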

External defusing

An exception to this rule of thumb (wrap foreign expressions in quosures, not your own expressions) arises when your function takes multiple expressions in a list instead of .... The preferred approach in that case is to take a tidy selection so that users can combine multiple columns using c(). If that is not possible, you can take a list of externally defused expressions:

my_group_by <- function(data, vars) {
  stopifnot(is_quosures(vars))
  data %>% dplyr::group_by(!!!vars)
}

mtcars %>% my_group_by(dplyr::vars(cyl, am))

In this pattern, dplyr::vars() defuses expressions externally. It creates a list of quosures because the expressions are passed around from function to function like regular arguments. In fact, dplyr::vars() and ggplot2::vars() are simple aliases of [quos()].

dplyr::vars(cyl, am)
#> <list_of<quosure>>
#> 
#> [[1]]
#> <quosure>
#> expr: ^cyl
#> env:  global
#> 
#> [[2]]
#> <quosure>
#> expr: ^am
#> env:  global
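The same kind of list can be created and consumed with rlang only. This is a sketch under the assumption that evaluating each quosure separately against the data frame is an acceptable stand-in for what a data-masking function does with spliced inputs.

```r
library(rlang)

# Externally defused expressions, as a user would supply them
vars <- quos(cyl, am)
is_quosures(vars)

# Evaluate each quosure with the data frame as data mask
cols <- lapply(vars, eval_tidy, data = mtcars)
lengths(cols)
```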

For more information about external defusing, see [topic-multiple-columns].

Technical description of quosures

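A quosure carries two things: an expression and an environment (the expr and env fields in the printed output above).

It also implements special evaluation and scoping behaviours: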

For historical reasons, [base::eval()] doesn't support quosure evaluation. Quosures currently require [eval_tidy()]. We would like to fix this limitation in the future.
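As a brief illustration, the two components of a quosure can be inspected with quo_get_expr() and quo_get_env(), and the quosure can be evaluated with eval_tidy(). The variable names offset and x are chosen for this sketch only.

```r
library(rlang)

offset <- 10
q <- quo(x + offset)

quo_get_expr(q)   # the expression: x + offset
quo_get_env(q)    # the environment where `quo()` was called

# `x` is supplied by the data mask, `offset` by the quosure environment
res <- eval_tidy(q, data = list(x = 1))
res
```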

Conceptually, a quosure inherits from two chains of environments, the data mask and the user environment. In practice rlang implements this special scoping by rechaining the top of the data mask to the quosure environment currently under evaluation.
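A small sketch of the resulting lookup order: a name present in both places is taken from the data mask, and a name missing from the mask falls back to the quosure environment. make_quosure(), value and threshold are hypothetical names.

```r
library(rlang)

make_quosure <- function() {
  value <- "environment"   # also present in the data mask below
  threshold <- 1           # only present in the quosure environment
  quo(list(value = value, threshold = threshold))
}

q <- make_quosure()

# Without a mask, both names come from the quosure environment
no_mask <- eval_tidy(q)

# With a mask, `value` is masked while `threshold` is still found
# in the quosure environment
masked <- eval_tidy(q, data = list(value = "mask"))
```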

There are similarities between promises (the ones R uses to implement lazy evaluation, not the async expressions from the promises package) and quosures. One important difference is that promises are only evaluated once and cache the result for subsequent evaluation. Quosures behave more like calls and can be evaluated repeatedly, potentially in a different data mask. This property is useful to implement split-apply-combine evaluations.
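Repeated evaluation can be sketched with base R's split() and a single quosure evaluated once per group. This is only an illustration of the idea, not how any particular package implements grouped computations.

```r
library(rlang)

q <- quo(mean(mpg))

# Split: one data frame per number of cylinders
groups <- split(mtcars, mtcars$cyl)

# Apply and combine: evaluate the same quosure in a different mask each time
means <- vapply(
  groups,
  function(group) eval_tidy(q, data = group),
  numeric(1)
)
means
```

A promise could not be reused this way because its value is cached after the first evaluation.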


rlang documentation built on Nov. 4, 2023, 9:06 a.m.