R mapping

rquery re-maps a number of symbols during SQL translation.

During expression parsing the internal rquery function tokenize_call_for_SQL() implements the following re-mappings from R idioms to SQL notation.

library("rquery")
library("wrapr")

show_translation <- function(strings) {
  vapply(strings,
         function(si) {
           format(rquery::tokenize_for_SQL(parse(text = si, keep.source = FALSE)[[1]], colnames = NULL)$parsed_toks)
         }, character(1))
}

mapping_table <- data.frame(
  example = c('!x', 'is.na(x)', 'ifelse(a, b, c)', 'a^b', 'a%%b', 
               'a==b', 'a&&b', 'a&b', 'a||b', 'a|b', 
              'pmin(a, b)', 'pmax(a, b)'),
  stringsAsFactors = FALSE)
mapping_table$translation <- show_translation(mapping_table$example)
knitr::kable(mapping_table)

Note: not all possible mappings are implemented. For example we currently do not re-map %in%, preferring the user to explicitly work with set_indicator() directly.

In addition to this the database connectors can specify additional re-mappings. This can be found by building a formal connector and inspecting the re-mappings.

have_RSQLite <- requireNamespace("RSQLite", quietly = TRUE)
raw_RSQLite_connection <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
RSQLite::initExtension(raw_RSQLite_connection)
db <- rquery_db_info(
  connection = raw_RSQLite_connection,
  is_dbi = TRUE,
  connection_options = rq_connection_tests(raw_RSQLite_connection))


fn_name_map <- db$connection_options[[paste0("rquery.", rq_connection_name(db), ".", "fn_name_map")]]
fn_name_map

We see above that "mean" is re-mapped to "avg".

In all cases we can see what re-mappings happen by examining a query.

d_local <- build_frame(
   "subjectID", "surveyCategory"     , "assessmentTotal", "irrelevantCol1", "irrelevantCol2" |
   1L         , "withdrawal behavior", 5                , "irrel1"        , "irrel2"         |
   1L         , "positive re-framing", 2                , "irrel1"        , "irrel2"         |
   3L         , "withdrawal behavior", 3                , "irrel1"        , "irrel2"         |
   3L         , "positive re-framing", 2                , "irrel1"        , "irrel2"         |
   3L         , "other"              , 1                , "irrel1"        , "irrel2"         )
table_handle <- rq_copy_to(db, 'd',
            d_local,
            temporary = TRUE, 
            overwrite = TRUE)
print(table_handle)

ops <- table_handle %.>% 
  project(., 
          avg_total := avg(pmax(0, assessmentTotal)),
          groupby = "subjectID")

cat(to_sql(ops, db))

ops %.>%
  execute(db, .) %.>%
  knitr::kable(.)

The basic mappings are stored in database option structures, and depend on the database. For example MOD is re-mapped back to % for RSQLite.

rquery::rq_function_mappings(db) %.>%
  knitr::kable(.)
ops <- table_handle %.>% 
  project(., groupby = "subjectID",
          n := 5, 
          count := n(),
          mean := mean(assessmentTotal)) %.>% 
  extend(., was_n := n)

cat(to_sql(ops, db))

ops %.>%
  execute(db, .) %.>%
  knitr::kable(.)

Additional function re-mappings can be specified by user code. One such example is re-writing MOD as % for RSQLite.

ops <- table_handle %.>% 
  extend(., z := 1 + subjectID %% 3) %.>%
  select_columns(., c("subjectID", "z"))

cat(to_sql(ops, db))

ops %.>%
  execute(db, .) %.>%
  knitr::kable(.)

rqdatatable also supplies some re-mappings (described here). This can allow us to use a uniform notation for tasks such as random number generation to allow portable pipelines.

DBI::dbDisconnect(raw_RSQLite_connection)


Try the rquery package in your browser

Any scripts or data that you put into this service are public.

rquery documentation built on Aug. 20, 2023, 9:06 a.m.