A quosure is a special type of [defused expression][topic-defuse] that keeps track of the original context the expression was written in. The tracking capabilities of quosures is important when interfacing [data-masking][topic-data-mask] functions together because the functions might come from two unrelated environments, like two different packages.
Let's take an example where the R user calls the function summarise_bmi()
from the foo package to summarise a data frame with statistics of a BMI value. Because the height
variable of their data frame is not in metres, they use a custom function div100()
to rescale the column.
# Global environment of user div100 <- function(x) { x / 100 } dplyr::starwars %>% foo::summarise_bmi(mass, div100(height))
The summarise_bmi()
function is a data-masking function defined in the namespace of the foo package which looks like this:
# Namespace of package foo bmi <- function(mass, height) { mass / height^2 } summarise_bmi <- function(data, mass, height) { data %>% bar::summarise_stats(bmi({{ mass }}, {{ height }})) }
The foo package uses the custom function bmi()
to perform a computation on two vectors. It interfaces with summarise_stats()
defined in bar, another package whose namespace looks like this:
# Namespace of package bar check_numeric <- function(x) { stopifnot(is.numeric(x)) x } summarise_stats <- function(data, var) { data %>% dplyr::transmute( var = check_numeric({{ var }}) ) %>% dplyr::summarise( mean = mean(var, na.rm = TRUE), sd = sd(var, na.rm = TRUE) ) }
Again the package bar uses a custom function, check_numeric()
, to validate its input. It also interfaces with data-masking functions from dplyr (using the [define-a-constant][topic-double-evaluation] trick to avoid issues of double evaluation).
There are three data-masking functions simultaneously interfacing in this snippet:
At the bottom, dplyr::transmute()
takes a data-masked input, and creates a data frame of a single column named var
.
Before this, bar::summarise_stats()
takes a data-masked input inside dplyr::transmute()
and checks it is numeric.
And first of all, foo::summarise_bmi()
takes two data-masked inputs inside bar::summarise_stats()
and transforms them to a single BMI value.
There is a fourth context, the global environment where summarise_bmi()
is called with two columns defined in a data frame, one of which is transformed on the fly with the user function div100()
.
All of these contexts (except to some extent the global environment) contain functions that are private and invisible to foreign functions. Yet, the final expanded data-masked expression that is evaluated down the line looks like this (with caret characters indicating the quosure boundaries):
dplyr::transmute( var = ^check_numeric(^bmi(^mass, ^div100(height))) )
The role of quosures is to let R know that check_numeric()
should be found in the bar package, bmi()
in the foo package, and div100()
in the global environment.
As a tidyverse user you generally don't need to worry about quosures because {{
and ...
will create them for you. Introductory texts like Programming with dplyr or the [standard data-mask programming patterns][topic-data-mask-programming] don't even mention the term. In more complex cases you might need to create quosures with [enquo()] or [enquos()] (even though you generally don't need to know or care that these functions return quosures). In this section, we explore when quosures are necessary in these more advanced applications.
As a rule of thumb, quosures are only needed for arguments defused with [enquo()] or [enquos()] (or with r link("{{")
which calls enquo()
implicitly):
my_function <- function(var) { var <- enquo(var) their_function(!!var) } # Equivalently my_function <- function(var) { their_function({{ var }}) }
Wrapping defused arguments in quosures is needed because expressions supplied as argument comes from a different environment, the environment of your user. For local expressions created in your function, you generally don't need to create quosures:
my_mean <- function(data, var) { # `expr()` is sufficient, no need for `quo()` expr <- expr(mean({{ var }})) dplyr::summarise(data, !!expr) } my_mean(mtcars, cyl)
Using [quo()] instead of [expr()] would have worked too but it is superfluous because dplyr::summarise()
, which uses [enquos()], is already in charge of wrapping your expression within a quosure scoped in your environment.
The same applies if you evaluate manually. By default, [eval()] and [eval_tidy()] capture your environment:
my_mean <- function(data, var) { expr <- expr(mean({{ var }})) eval_tidy(expr, data) } my_mean(mtcars, cyl)
An exception to this rule of thumb (wrap foreign expressions in quosures, not your own expressions) arises when your function takes multiple expressions in a list instead of ...
. The preferred approach in that case is to take a tidy selection so that users can combine multiple columns using c()
. If that is not possible, you can take a list of externally defused expressions:
my_group_by <- function(data, vars) { stopifnot(is_quosures(vars)) data %>% dplyr::group_by(!!!vars) } mtcars %>% my_group_by(dplyr::vars(cyl, am))
In this pattern, dplyr::vars()
defuses expressions externally. It creates a list of quosures because the expressions are passed around from function to function like regular arguments. In fact, dplyr::vars()
and ggplot2::vars()
are simple aliases of [quos()].
dplyr::vars(cyl, am) #> <list_of<quosure>> #> #> [[1]] #> <quosure> #> expr: ^cyl #> env: global #> #> [[2]] #> <quosure> #> expr: ^am #> env: global
For more information about external defusing, see r link("topic_multiple_columns")
.
A quosure carries two things:
And implements these behaviours:
For historical reasons, [base::eval()] doesn't support quosure evaluation. Quosures currently require [eval_tidy()]. We would like to fix this limitation in the future.
It is hygienic. It evaluates in the tracked environment.
It is maskable. If evaluated in a data mask (currently only masks created with [eval_tidy()] or [new_data_mask()]), the mask comes first in scope before the quosure environment.
Conceptually, a quosure inherits from two chains of environments, the data mask and the user environment. In practice rlang implements this special scoping by rechaining the top of the data mask to the quosure environment currently under evaluation.
There are similarities between promises (the ones R uses to implement lazy evaluation, not the async expressions from the promises package) and quosures. One important difference is that promises are only evaluated once and cache the result for subsequent evaluation. Quosures behave more like calls and can be evaluated repeatedly, potentially in a different data mask. This property is useful to implement split-apply-combine evaluations.
[enquo()] and [enquos()] to defuse function arguments as quosures. This is the main way quosures are created.
[quo()] which is like [expr()] but wraps in a quosure. Usually it is not needed to wrap local expressions yourself.
[quo_get_expr()] and [quo_get_env()] to access quosure components.
[new_quosure()] and [as_quosure()] to assemble a quosure from components.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.