tidyr_tidy_select: Argument type: tidy-select

tidyr_tidy_select    R Documentation

Argument type: tidy-select

Description

This page describes the <tidy-select> argument modifier, which indicates that the argument uses tidy selection, a sub-type of tidy evaluation. If you've never heard of tidy evaluation before, start with the practical introduction in https://r4ds.hadley.nz/functions.html#data-frame-functions, then read more about the underlying theory in https://rlang.r-lib.org/reference/topic-data-mask.html.

Overview of selection features

tidyselect implements a domain-specific language (DSL) for selecting variables. It provides helpers that select variables by position, name, or type:

  • var1:var10: variables lying between var1 on the left and var10 on the right.

  • starts_with("a"): names that start with "a".

  • ends_with("z"): names that end with "z".

  • contains("b"): names that contain "b".

  • matches("x.y"): names that match regular expression x.y.

  • num_range("x", 1:4): names following the pattern x1, x2, ..., x4.

  • all_of(vars)/any_of(vars): matches names stored in the character vector vars. all_of(vars) will error if any of the variables aren't present; any_of(vars) will match just the variables that exist.

  • everything(): all variables.

  • last_col(): furthest column on the right.

  • where(is.numeric): all variables where is.numeric() returns TRUE.
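The helpers above can be tried out on a small example. This is only a sketch: the data frame and its column names are made up for illustration, and dplyr's select() is used as a convenient function that accepts a tidy selection.

```r
library(dplyr)

# Hypothetical data frame; the column names are illustrative only.
df <- tibble(
  id = 1:3,
  x1 = c(0.1, 0.2, 0.3),
  x2 = c(1.5, 2.5, 3.5),
  x3 = c(10, 20, 30),
  label_a = c("p", "q", "r"),
  label_b = c("u", "v", "w")
)

names(select(df, x1:x3))                 # "x1" "x2" "x3"
names(select(df, starts_with("label")))  # "label_a" "label_b"
names(select(df, ends_with("_b")))       # "label_b"
names(select(df, contains("abel")))      # "label_a" "label_b"
names(select(df, matches("^x[12]$")))    # "x1" "x2"
names(select(df, num_range("x", 2:3)))   # "x2" "x3"
names(select(df, last_col()))            # "label_b"
names(select(df, where(is.numeric)))     # "id" "x1" "x2" "x3"
```

Wrapping each call in names() keeps the output short and shows only which columns were selected.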

tidyselect also provides operators for combining those selections:

  • !selection: only variables that don't match selection.

  • selection1 & selection2: only variables included in both selection1 and selection2.

  • selection1 | selection2: all variables that match either selection1 or selection2.
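As a rough illustration of the three operators, the sketch below applies each one to a small data frame. The data frame and its column names are hypothetical, and dplyr's select() stands in for any function that takes a tidy selection.

```r
library(dplyr)

# Hypothetical data frame; the column names are illustrative only.
df <- tibble(
  id = 1:3,
  x1 = c(0.1, 0.2, 0.3),
  x2 = c(1.5, 2.5, 3.5),
  note = c("a", "b", "c")
)

# Complement: every column that is not numeric.
names(select(df, !where(is.numeric)))                 # "note"

# Intersection: columns that start with "x" AND end with "2".
names(select(df, starts_with("x") & ends_with("2")))  # "x2"

# Union: the id column OR any column that starts with "x".
names(select(df, id | starts_with("x")))              # "id" "x1" "x2"
```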

Key techniques

  • If you want the user to supply a tidyselect specification in a function argument, you need to tunnel the selection through the function argument. This is done by embracing the function argument with {{ }}, e.g. unnest(df, {{ vars }}).

  • If you have a character vector of column names, use all_of() or any_of(), depending on whether or not you want unknown variable names to cause an error, e.g. unnest(df, all_of(vars)), unnest(df, !any_of(vars)).

  • To suppress R CMD check NOTEs about unknown variables, use "var" instead of var:

# has NOTE
df %>% select(x, y, z)

# no NOTE
df %>% select("x", "y", "z")
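The first two techniques, embracing and all_of()/any_of(), might look like this in practice. This is a minimal sketch: the data frame, the wrapper name unnest_cols(), and the column names are all hypothetical and chosen only for illustration.

```r
library(tidyr)

# Hypothetical data with two list-columns; names are illustrative only.
df <- tibble::tibble(
  g = c("a", "b"),
  vals = list(1:2, 3:5),
  other = list("p", c("q", "r"))
)

# Hypothetical wrapper: {{ cols }} tunnels the caller's selection to unnest().
unnest_cols <- function(data, cols) {
  unnest(data, {{ cols }})
}

out_embrace <- unnest_cols(df, vals)
nrow(out_embrace)  # 5: one row per element of `vals`

# Column names held in a character vector: all_of() is strict.
vars <- c("vals")
out_all_of <- unnest(df, all_of(vars))

# any_of() silently ignores names that are not present in the data.
out_any_of <- unnest(df, any_of(c("vals", "not_a_column")))

# all_of() raises an error when a requested name is missing.
strict_result <- tryCatch(
  unnest(df, all_of(c("vals", "not_a_column"))),
  error = function(e) "error"
)
```

Here strict_result ends up as the string "error" because all_of() refuses the unknown name, whereas the any_of() call succeeds and unnests only vals.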

tidyverse/tidyr documentation built on Jan. 28, 2024, 12:10 a.m.