dplyr is inconsistent as to which column is selected unless one uses extra notation such as !!, {{}}, .data[[]], and so on. Of course, if using a name or string directly is not the "correct" notation, why is it allowed? Notice how a different column is selected in each example, depending on the columns present in the data.frame. The issue is that dplyr does not commit to an unambiguous interpretation of the basic notation (only the longer, more complicated notations have reliable semantics).
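    library("dplyr")

    y = "x"

    # no column named y: the value of the variable y ("x") picks the column
    data.frame(x = 1) %>% select(y)
    # a column named y exists: the same code now picks column y
    data.frame(x = 1, y = 2) %>% select(y)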
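To make the difference easier to see, here is a minimal sketch that records only the column names each call returns. It assumes a dplyr/tidyselect version that still accepts a bare external variable in select(); recent versions emit a deprecation warning for the first call, and a future version may reject it outright. The names d1 and d2 are introduced here for illustration only.

```r
library("dplyr")

y <- "x"

# Only column x is present, so select() falls back to the value of y.
d1 <- data.frame(x = 1) %>% select(y)

# A column named y is present, so the identical code selects that column.
d2 <- data.frame(x = 1, y = 2) %>% select(y)

print(names(d1))
print(names(d2))
```

The select(y) expression is character-for-character the same in both calls; only the input data differs.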
dplyr notations that are unambiguous include:
    data.frame(x = 1) %>% select({{y}})
    data.frame(x = 1, y = 2) %>% select({{y}})

    data.frame(x = 1) %>% select(!!y)
    data.frame(x = 1, y = 2) %>% select(!!y)

    data.frame(x = 1) %>% select(!!rlang::enquo(y))
    data.frame(x = 1, y = 2) %>% select(!!rlang::enquo(y))

    data.frame(x = 1) %>% select(.data[[y]])
    data.frame(x = 1, y = 2) %>% select(.data[[y]])
But other notations don't work (.data is apparently a mapping from column names to column indices, and not in fact a reference to the incoming data.frame).
    data.frame(x = 1) %>% select(.data[y])
    data.frame(x = 1, y = 2) %>% select(.data[y])
R itself does not have this problem. Notice how the column named by y (which turns out to be x) is reliably chosen in all cases. In the [] and [[]] notations, columns are always specified by values (not taken from code or variable names), while $ always takes the name from the code and never from a value.
    y = "x"

    data.frame(x = 1)[y]
    data.frame(x = 1, y = 2)[y]
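The value-versus-code distinction can be sketched in base R alone. The data.frame d and the result names below are introduced here for illustration; everything else is standard base R indexing.

```r
y <- "x"
d <- data.frame(x = 1, y = 2)

# [] and [[]] evaluate their argument, so the value of y ("x") names the column.
by_single <- d[y]      # one-column data.frame holding column x
by_double <- d[[y]]    # the contents of column x

# $ reads the name from the code itself, so d$y is column y
# regardless of what the variable y holds.
by_dollar <- d$y

print(names(by_single))
print(by_double)
print(by_dollar)
```

Because each operator commits to a single interpretation, adding or removing a column named y does not change which column d[y] or d[[y]] refers to.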
rqdatatable also has reliable column selection semantics: columns are always values (not taken from code or variable names).
    library("rqdatatable")

    y = "x"

    data.frame(x = 1) %.>% select_columns(., y)
    data.frame(x = 1, y = 2) %.>% select_columns(., y)