eval_db <- FALSE
if(Sys.getenv("GLOBAL_EVAL") != "") eval_db <- Sys.getenv("GLOBAL_EVAL")
library(connections)
library(RSQLite)
library(dplyr)
library(dbplyr)

Databases and dplyr

Intro to connections

Use connections to open open a database connection

  1. Load the connections package r library(connections)

  2. Use connection_open() to open a Database connection r {{db_connection}}

  3. The RStudio Connections pane should show the tables in the database

Table reference

Use the dplyr's tbl() command

  1. Load the dplyr package r library(dplyr)

  2. Use the tbl() command to call the customer table r tbl(con, "customer")

  3. Add in_schema() as an argument to tbl() to specify the schema. The default schema is called: "main" r tbl(con, in_schema("main", "customer"))

  4. Load the results from the tbl() command that points to the table called order to a variable called order r order <- tbl(con, "order")

Under the hood

Use show_query() to preview the SQL statement that will be sent to the database

  1. Use show_query() to preview SQL statement that actually runs when we run order as a command r show_query(order)

  2. Easily view the resulting query by adding show_query() in another piped command r order %>% show_query()

  3. Insert head() in between the two statements to see how the SQL changes r order %>% head() %>% show_query()

  4. Use sql_render() and simulate_mssql() to see how the SQL statement changes from vendor to vendor r order %>% head() %>% sql_render(con = simulate_mssql())

  5. Use explain() to explore the query plan r order %>% head() %>% explain()

Un-translated R commands

Review of how dbplyr handles R commands that have not been translated into a like-SQL command

  1. Preview how Sys.time() is translated r order %>% mutate(today = Sys.time()) %>% show_query()

  2. Use SQLite's native commands, in this case date() r order %>% mutate(today = date('now')) %>% show_query()

  3. Run the dplyr code to confirm it works r order %>% mutate(today = date('now')) %>% select(today) %>% head()

Using bang-bang

Intro on passing unevaluated code to a dplyr verb

  1. Preview how Sys.time() is translated when prefixing !! r order %>% mutate(today = !!Sys.time()) %>% show_query()

  2. Preview how Sys.time() is translated when prefixing !! r order %>% mutate(today = !!Sys.time()) %>% select(today) %>% head()



edgararuiz/bigdataclass documentation built on Jan. 3, 2020, 6:46 p.m.