knitr::opts_chunk$set(connection = "con", max.print = 5)
eval_dbi <- FALSE
if(Sys.getenv("GLOBAL_EVAL") != "") eval_dbi <- Sys.getenv("GLOBAL_EVAL")
library(DBI)

Introduction to DBI

Local database basics

Connecting and adding data to a database

  1. Load the DBI package r library(DBI)

  2. Use dbConnect to open a database connection r con <- dbConnect(RSQLite::SQLite(), "mydatabase.sqlite")

  3. Use dbListTables() to view existing tables, there should be 0 tables r dbListTables(con)

  4. Use dbWriteTable() to create a new table using mtcars data. Name it db_mtcars r dbWriteTable(con, "db_mtcars", mtcars)

  5. Use dbListTables() to view existing tables, it should return db_mtcars r dbListTables(con)

  6. Use dbGetQuery() to pass a SQL query to the database r dbGetQuery(con, "select * from db_mtcars")

  7. Close the database connection using dbDisconnect() r dbDisconnect(con)

Options for writing tables

Understand how certain arguments in dbWriteTable() work

  1. Use dbConnect() to open a Database connection again r con <- dbConnect(RSQLite::SQLite(), "mydatabase.sqlite")

  2. Use dbWriteTable() to re-create the db_mtcars table using mtcars data r dbWriteTable(con, "db_mtcars", mtcars)

 Error: Table db_mtcars exists in database, and both overwrite and append are FALSE
  1. Use the append argument in dbWriteTable() to add to the data in the db_mtcars table r dbWriteTable(con, "db_mtcars", mtcars, append = TRUE)

  2. Using dbGetQuery(), check the current record count of db_mtcars with the following query: "select count() from db_mtcars" r dbGetQuery(con, "select count() from db_mtcars")

  3. Use the overwrite argument to dbWriteTable() to replace the data in the db_mtcars table r dbWriteTable(con, "db_mtcars", mtcars, overwrite = TRUE)

  4. Check the record count of db_mtcars again r dbGetQuery(con, "select count() from db_mtcars")

Database operations

Understand how to use dbSendStatement() and dbExecute() to modify the database

  1. Use dbSendStatement() to pass a SQL commands that deletes any automatic car from db_mtcars: "delete from db_mtcars where am = 1". Load the command to a variable called rs r rs <- dbSendStatement(con, "delete from db_mtcars where am = 1")

  2. Call the rs variable to view information about the results of the requested change r rs

  3. Use dbHasCompleted() to confirm that the job is complete r dbHasCompleted(rs)

  4. Use dbGetRowsAffected() to see the number of rows that were affected by the request r dbGetRowsAffected(rs)

  5. Clear the results using dbClearResult() r dbClearResult(rs)

  6. Confirm that the result set has been removed by calling the rs variable once more r rs

  7. Check the record count of db_mtcars again, the new count should be 19 (32 original records - 13 deleted records) r dbGetQuery(con, "select count() from db_mtcars")

  8. Use dbWriteTable() to overwrite db_mtcars with the value of mtcars r dbWriteTable(con, "db_mtcars", mtcars, overwrite = TRUE)

  9. Use dbExeceute() to delete rows where am = 1 using the same query as before. Load the results in a variable called rs r rs <- dbExecute(con, "delete from db_mtcars where am = 1")

  10. rs contains the number of rows affected by the statement that was executed r rs

  11. Check the record count of db_mtcars again. r dbGetQuery(con, "select count() from db_mtcars")

knitr SQL engine

See how to run SQL queries as code chunks

  1. Start a new code chunk, but using sql instead of r as the first argument of the chunk. Also add connection=con as another argument of the chunk.

    r ''````{sql, connection=con} select * from db_mtcarsr ''````

```{sql, connection=con, echo = FALSE} select * from db_mtcars

1. Add the `max.print` options to the chunk, and set it to 5

    `r ''````{sql, connection=con, max.print = 5}
    select * from db_mtcars
    `r ''````

```{sql, connection=con, echo = FALSE, max.print = 5}
select * from db_mtcars
  1. Set defaults for the sql chunks by using the knitr::opts_chunk$set() command in the setup at the beginning of the document. r ''````r knitr::opts_chunk$set(connection = "con", max.print = 5)r ''````

  2. Run the same query in a new sql chunk, but without any other argument

    r ''````{sql} select * from db_mtcarsr ''````

```{sql, echo = FALSE} select * from db_mtcars

1. Store the results of the query into an R object called `local_mtcars` using 
the `output.var` option.
```{sql, output.var = "local_mtcars"}
select * from db_mtcars
local_mtcars
  1. Close the database connection using dbDisconnect()
dbDisconnect(con)


rstudio-conf-2020/big-data documentation built on Feb. 4, 2020, 5:24 p.m.