In WinVector/rquery: Relational Query Generator for Data Manipulation at Scale

dplyr is inconsistent about which column is selected unless one uses extra notation such as !!, {{}}, .data[[]], and so on. Of course, if using a name or string directly is not the "correct" notation, why is it allowed? Notice how a different column is selected in each example, depending on which columns are present in the data.frame. The issue is that dplyr does not commit to an unambiguous interpretation of the basic notation; only the longer, more complicated notations have reliable semantics.

library("dplyr")

y = "x"

data.frame(x = 1) %>%
  select(y)

data.frame(x = 1, y = 2) %>%
  select(y)
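To make the difference concrete, the two calls can be captured and their column names compared. This is a minimal sketch: the names only_x and x_and_y are hypothetical, introduced only for illustration, and it assumes a dplyr/tidyselect version that still lets a bare name fall back to a variable in the calling environment (recent versions are expected to emit a deprecation warning for the first call).

```r
library("dplyr")

y = "x"

# No column named y exists, so the bare name y falls back to the variable y ("x")
only_x = data.frame(x = 1) %>%
  select(y)

# A column named y exists, so the same code now means that column
x_and_y = data.frame(x = 1, y = 2) %>%
  select(y)

colnames(only_x)
colnames(x_and_y)
```

Under those assumptions the first result should contain column x and the second column y, even though the select() call is identical.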

dplyr notations that are unambiguous include:

data.frame(x = 1) %>%
  select({{y}})

data.frame(x = 1, y = 2) %>%
  select({{y}})

data.frame(x = 1) %>%
  select(!!y)

data.frame(x = 1, y = 2) %>%
  select(!!y)

data.frame(x = 1) %>%
  select(!!rlang::enquo(y))

data.frame(x = 1, y = 2) %>%
  select(!!rlang::enquo(y))

data.frame(x = 1) %>%
  select(.data[[y]])

data.frame(x = 1, y = 2) %>%
  select(.data[[y]])
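Another value-oriented spelling, not among the original examples, is the tidyselect helper all_of(), which dplyr re-exports. A minimal sketch, assuming a dplyr/tidyselect installation recent enough to provide all_of(); the names a and b are hypothetical and only hold the results for inspection.

```r
library("dplyr")

y = "x"

# all_of() takes a character vector of column names, so y is used as a value
a = data.frame(x = 1) %>%
  select(all_of(y))

b = data.frame(x = 1, y = 2) %>%
  select(all_of(y))

colnames(a)
colnames(b)
```

Both calls should pick column x, regardless of whether a column named y is present.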

But other notations don't work (.data is apparently a mapping from column names to column indices, and not in fact a reference to the incoming data.frame).

data.frame(x = 1) %>%
  select(.data[y])

data.frame(x = 1, y = 2) %>%
  select(.data[y])

R itself does not have this problem. Notice how the column named by y (which turns out to be x) is reliably chosen in all cases. In the [] and [[]] notations, columns are always specified by value (never taken from code or variable names), while $ always takes the name from the code and never from a value.

y = "x"

data.frame(x = 1)[y]

data.frame(x = 1, y = 2)[y]
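The same rule can be checked for [[]] and $ side by side. This is a small base R sketch; the names d, by_value, and by_code are hypothetical, introduced only for this illustration.

```r
y = "x"
d = data.frame(x = 1, y = 2)

# [[]] evaluates y and uses its value "x" as the column name
by_value = d[[y]]

# $ uses the symbol y as written in the code and ignores the variable y
by_code = d$y

by_value  # 1, the contents of column x
by_code   # 2, the contents of column y
```

Each notation commits to one interpretation, so the result does not depend on which other columns happen to be present.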

rqdatatable also has reliable column selection semantics: columns are always specified by value (not taken from code or variable names).

library("rqdatatable")

y = "x"

data.frame(x = 1) %.>% 
  select_columns(., y)

data.frame(x = 1, y = 2) %.>% 
  select_columns(., y)

WinVector/rquery documentation built on Aug. 24, 2023, 11:12 a.m.
