Query Generation
In rquery: Relational Query Generator for Data Manipulation at Scale

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
run_vignette <- requireNamespace("DBI", quietly = TRUE) && requireNamespace("RSQLite", quietly = TRUE)

This is a very brief placeholder example. It is not executed, as SQLite does not have the needed window functions. For more details, please see the fuller note in the rquery README.

The primary purpose of rquery is SQL query generation. We demonstrate this below.

library("rquery")
library("wrapr")

# this db does not have window fns
my_db <- DBI::dbConnect(RSQLite::SQLite(), 
                        ":memory:")

dbopts <- rq_connection_tests(my_db)
print(dbopts)
options(dbopts)

# copy in example data
d_local <- build_frame(
   "subjectID", "surveyCategory"     , "assessmentTotal", "irrelevantCol1", "irrelevantCol2" |
   1          , "withdrawal behavior", 5                , "irrel1"        , "irrel2"         |
   1          , "positive re-framing", 2                , "irrel1"        , "irrel2"         |
   2          , "withdrawal behavior", 3                , "irrel1"        , "irrel2"         |
   2          , "positive re-framing", 4                , "irrel1"        , "irrel2"         )
rq_copy_to(my_db, 'd',
            d_local,
            temporary = TRUE, 
            overwrite = TRUE)

Note: in these examples we use rq_copy_to() to create data. This is only to keep the examples simple and portable. With big data, the data is usually already in the remote database or Spark system. The task is almost always to connect to and work with this pre-existing remote data, and the method for doing so is db_td(), which builds a reference to a remote table given the table name.

# produce a handle to the existing table
d <- db_td(my_db, "d")

scale <- 0.237

dq <- d %.>%
  extend(.,
         one = 1) %.>%
  extend(.,
         probability :=
           exp(assessmentTotal * scale)/
           sum(exp(assessmentTotal * scale)),
         count := sum(one),
         partitionby = 'subjectID') %.>%
  extend(.,
         rank := cumsum(one),
         partitionby = 'subjectID',
         orderby = c('probability', 'surveyCategory'))  %.>%
  rename_columns(., 'diagnosis' := 'surveyCategory') %.>%
  select_rows(., rank == count) %.>%
  select_columns(., c('subjectID', 
                      'diagnosis', 
                      'probability')) %.>%
  orderby(., 'subjectID')

class(my_db)
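
Because the query itself is not executed in this note, it can help to see what the pipeline is meant to compute. The following is a minimal base-R sketch, not part of rquery, that reproduces the same logic locally: a per-subject softmax of assessmentTotal * scale, followed by keeping the row with the highest probability for each subject (the row where rank == count). The name local_result is introduced here only for illustration, and the data frame simply restates the example data from above.

```r
# Local base-R reference computation (an illustrative sketch; no database needed).
d_local <- data.frame(
  subjectID = c(1, 1, 2, 2),
  surveyCategory = c("withdrawal behavior", "positive re-framing",
                     "withdrawal behavior", "positive re-framing"),
  assessmentTotal = c(5, 2, 3, 4),
  stringsAsFactors = FALSE
)
scale <- 0.237

# Per-subject softmax: exp(x * scale) divided by the within-subject sum.
w <- exp(d_local$assessmentTotal * scale)
d_local$probability <- w / ave(w, d_local$subjectID, FUN = sum)

# Sort within subject by (probability, surveyCategory); the last row of each
# subject is then the one where the running rank equals the group count.
ord <- order(d_local$subjectID, d_local$probability, d_local$surveyCategory)
d_sorted <- d_local[ord, ]
is_last <- !duplicated(d_sorted$subjectID, fromLast = TRUE)

# Rename surveyCategory to diagnosis and keep only the columns of interest.
local_result <- data.frame(
  subjectID = d_sorted$subjectID[is_last],
  diagnosis = d_sorted$surveyCategory[is_last],
  probability = d_sorted$probability[is_last],
  stringsAsFactors = FALSE
)
print(local_result)
```

For this data the sketch selects "withdrawal behavior" for subject 1 (probability about 0.671) and "positive re-framing" for subject 2 (probability about 0.559). A database that supports the needed window functions would be expected to return the same rows from the generated SQL.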

Presentation format (see also op_diagram()):

cat(format(dq))

to_sql() SQL (see also materialize()):

sql <- to_sql(dq, db = my_db, source_limit = 1000)

cat(sql)

DBI::dbDisconnect(my_db)

rquery documentation built on Aug. 20, 2023, 9:06 a.m.
