NEWS.md

tidyselect (development version)

tidyselect 1.2.1

tidyselect 1.2.0

New features

Lifecycle changes

Minor improvements and bug fixes

tidyselect 1.1.2

tidyselect 1.1.1

tidyselect 1.1.0

iris %>% select(where(is.factor))

We made this change to avoid puzzling error messages when a variable is unexpectedly missing from the data frame and there is a corresponding function in the environment:

# Attempts to invoke `data()` function
data.frame(x = 1) %>% select(data)

Now tidyselect will correctly complain about a missing variable rather than trying to invoke a function.
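The behaviour described above can be sketched as follows. This is only an illustration, assuming a recent dplyr is installed; the names `factors` and `result` are introduced here for the example.

```r
library(dplyr)

# Predicate functions are wrapped in where()
factors <- iris %>% select(where(is.factor))
names(factors)

# `data` is not a column of this data frame. Although a function named
# data() exists in the environment, the selection fails with an error
# about a missing variable instead of invoking that function.
result <- tryCatch(
  data.frame(x = 1) %>% select(data),
  error = function(e) "error"
)
result
```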

For compatibility, we will continue to support bare predicate functions whose names start with "is" for one more version.

tidyselect 1.0.0

This is the 1.0.0 release of tidyselect. It features a more solidly defined and implemented syntax, support for predicate functions, new boolean operators, and much more.

Documentation

Breaking changes

tidyselect now uses vctrs for validating inputs. These changes may reveal programming errors that were previously silent. They may also cause failures if your unit tests make faulty assumptions about the content of error messages created in tidyselect:

Note that we recommend testthat::verify_output() for monitoring error messages thrown from packages that you don't control. Unlike expect_error(), verify_output() does not cause CMD check failures when error messages have changed. See https://www.tidyverse.org/blog/2019/11/testthat-2-3-0/ for more information.

Syntax

These patterns can currently be achieved using -, c() and intersect() respectively. The boolean operators should be more intuitive to use.

Many thanks to Irene Steves (@isteves) for suggesting this UI.
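As a small sketch of the boolean operators, assuming dplyr is installed and using the built-in iris data (the object names are chosen for this example only):

```r
library(dplyr)

# `!` takes the complement of a selection (previously written with `-`)
not_species <- iris %>% select(!Species)

# `|` takes the union of two selections (previously written with c())
sepal_or_width <- iris %>% select(starts_with("Sepal") | ends_with("Width"))

# `&` takes the intersection of two selections
# (previously written with intersect())
sepal_and_width <- iris %>% select(starts_with("Sepal") & ends_with("Width"))

names(sepal_and_width)
```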

iris %>% select(is.factor)
iris %>% select(is.factor | is.numeric)

This feature is not available in functions that use the legacy interface of tidyselect. These need to be updated to use the new eval_select() function instead of vars_select().
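A client function built on eval_select() might look roughly like the sketch below. `select_cols()` is a hypothetical helper, not part of tidyselect, and the predicate is wrapped in where() as required since tidyselect 1.1.0.

```r
library(tidyselect)
library(rlang)

# Hypothetical client function (not part of tidyselect)
select_cols <- function(data, ...) {
  # eval_select() returns a named integer vector of column positions
  pos <- eval_select(expr(c(...)), data)
  # Subset by position and apply any new names given in the selection
  set_names(data[pos], names(pos))
}

numeric_cols <- select_cols(iris, where(is.numeric))
names(numeric_cols)
```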

data %>% select(1:ncol(data))
data %>% pivot_longer(1:ncol(data))

Even if the data frame data contains a column also named data, the subexpression ncol(data) is still correctly evaluated. The data:ncol(data) expression is equivalent to 2:3 because data is looked up in the relevant context without ambiguity:

data <- tibble(foo = 1, data = 2, bar = 3)
data %>% dplyr::select(data:ncol(data))
#> # A tibble: 1 x 2
#>    data   bar
#>   <dbl> <dbl>
#> 1     2     3

While the example above is a bit contrived, there are many realistic cases where these changes make it easier to write safe code:

select_from <- function(data, var) {
  data %>% dplyr::select({{ var }} : ncol(data))
}
data %>% select_from(data)
#> # A tibble: 1 x 2
#>    data   bar
#>   <dbl> <dbl>
#> 1     2     3

User-facing improvements

vars <- c("Species", "Genus")
iris %>% dplyr::select(-any_of(vars))

Note that all_of() and any_of() are a bit more conservative in their function signature than one_of(): they do not accept dots. The equivalent of one_of("a", "b") is all_of(c("a", "b")).
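The practical difference between the two helpers can be sketched as follows, assuming dplyr is installed. The names `kept` and `strict` are introduced for this example only.

```r
library(dplyr)

vars <- c("Species", "Genus")  # "Genus" is not a column of iris

# any_of() ignores names that are missing from the data
kept <- iris %>% select(-any_of(vars))
names(kept)

# all_of() is strict: a missing name is an error
strict <- tryCatch(
  iris %>% select(all_of(vars)),
  error = function(e) "error"
)
strict
```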

```r
# Before
dplyr::select(mtcars, dplyr::starts_with("c"))

# After
dplyr::select(mtcars, starts_with("c"))
```

It is still recommended to export the helpers from your package so that users can easily look up the documentation with ?.

starts_with(c("a", "b"))
starts_with("a") | starts_with("b")

API

New eval_select() and eval_rename() functions for client packages. These replace vars_select() and vars_rename(), which are now deprecated. These functions:
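As a rough illustration of what the two functions return, here is a minimal sketch using the built-in mtcars data. The object names and the new column name `miles_per_gallon` are chosen for this example only.

```r
library(tidyselect)
library(rlang)

# eval_select() returns the positions of the selected columns,
# named after the selected columns
sel <- eval_select(expr(c(mpg, cyl)), mtcars)
sel

# eval_rename() returns the positions of the renamed columns,
# named after their new names
ren <- eval_rename(expr(c(miles_per_gallon = mpg)), mtcars)
ren
```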

Other features and fixes

tidyselect 0.2.5

This is a maintenance release for compatibility with rlang 0.3.0.

tidyselect 0.2.4

tidyselect 0.2.3

tidyselect 0.2.2

tidyselect 0.2.1

vars <- c("cyl", "am", "disp", "drat")
vars_select(names(mtcars), - !!vars)

tidyselect 0.2.0

The main point of this release is to revert a troublesome behaviour introduced in tidyselect 0.1.0. It also includes a few features.

Evaluation rules

The special evaluation semantics for selection have been changed back to the old behaviour because the new rules were causing too much trouble and confusion. From now on, data expressions (symbols and calls to : and c()) can refer both to registered variables and to objects from the context.

However, the semantics for context expressions (any calls other than to : and c()) remain the same. Those expressions are evaluated in the context only and cannot refer to registered variables.

If you're writing functions that refer to contextual objects, it is still a good idea to avoid data expressions. Registered variables change as a function of user input, so you never know whether your local objects might be shadowed by a variable. Consider:

n <- 2
vars_select(letters, 1:n)

Should that select up to the second element of letters or up to the 14th? Since the variables have precedence in a data expression, and "n" is the 14th letter, this will select the first 14 letters. This can be made more robust by turning the data expression into a context expression:

vars_select(letters, seq(1, n))

You can also use quasiquotation since unquoted arguments are guaranteed to be evaluated without any user data in scope. While equivalent because of the special rules for context expressions, this may be clearer to the reader accustomed to tidy eval:

vars_select(letters, seq(1, !! n))

Finally, you may want to be more explicit in the opposite direction. If you expect a variable to be found in the data but not in the context, you can use the .data pronoun:

vars_select(names(mtcars), .data$cyl : .data$drat)
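The difference between data and context expressions described above can be checked with a short sketch. The names `data_expr` and `context_expr` are introduced for this example, and it assumes vars_select() is still available in the installed version of tidyselect.

```r
library(tidyselect)

n <- 2

# Data expression: `n` matches the registered variable "n",
# which is the 14th element of `letters`
data_expr <- vars_select(letters, 1:n)
length(data_expr)

# Context expression: `n` is looked up in the environment only,
# so this selects the first two letters
context_expr <- vars_select(letters, seq(1, n))
unname(context_expr)
```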

New features

Fixes

tidyselect 0.1.1

tidyselect is the new home for the legacy functions dplyr::select_vars(), dplyr::rename_vars() and dplyr::select_var().

API changes

We took this opportunity to make a few changes to the API:

vars_select(vars, !!! quos(...))

Establishing a variable context

tidyselect provides a few more ways of establishing a variable context:

with_vars() takes variables and an expression and evaluates the latter in the context of the former.

current_vars() has been renamed to peek_vars(). This naming is a reference to peek and poke from legacy languages.
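A minimal sketch of a variable context, assuming with_vars() and peek_vars() are exported by the installed version of tidyselect (the variable names are chosen for illustration):

```r
library(tidyselect)

# Inside with_vars(), peek_vars() returns the registered variables
seen <- with_vars(c("cyl", "mpg"), peek_vars())
as.character(seen)

# Selection helpers consult the same context and return positions
pos <- with_vars(c("cyl", "mpg"), starts_with("m"))
as.integer(pos)
```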

New evaluation semantics

The evaluation semantics for selecting verbs have changed. Symbols are now evaluated in a data-only context that is isolated from the calling environment. This means that you can no longer refer to local variables unless you are explicitly unquoting these variables with !!, which is mostly for expert use.

Note that since dplyr 0.7, helper calls (like starts_with()) obey the opposite behaviour and are evaluated in the calling context isolated from the data context. To sum up, symbols can only refer to data frame objects, while helpers can only refer to contextual objects. This differs from usual R evaluation semantics where both the data and the calling environment are in scope (with the former prevailing over the latter).



tidyverse/tidyselect documentation built on March 14, 2024, 3:16 p.m.