knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%", error = FALSE )
The dbcooper package turns a database connection into a collection of functions, handling logic for keeping track of connections and letting you take advantage of autocompletion when exploring a database.
It's especially helpful to use when authoring database-specific R packages, for instance in an internal company package or one wrapping a public data source.
The package's name is a reference to the bandit D.B. Cooper.
You can install the development version from GitHub with:
# install.packages("devtools") devtools::install_github("pipeline-tools/dbcooper")
The dbcooper package asks you to create the connection first. As an example, we'll use the Lahman baseball database packaged with dbplyr.
library(dplyr) lahman_db <- dbplyr::lahman_sqlite() lahman_db
You set up dbcooper with the dbc_init
function, passing it a prefix lahman
that will apply to all the functions it creates.
library(dbcooper) dbc_init(lahman_db, "lahman")
dbc_init
then creates user-friendly accessor functions in your global environment. (You could also pass it an environment in which the functions will be created).
dbc_init
adds several functions when it initializes a database source. In this case, each will start with the lahman_
prefix.
_list
: Get a list of tables_tbl
: Access a table that can be worked with in dbplyr_query
: Perform of a SQL query and work with the result_execute
: Execute a query (such as a CREATE
or DROP
)_src
: Retrieve a dbi_src
for the databaseFor instance, we could start by finding the names of the tables in the Lahman database.
lahman_list()
We can access one of these tables with lahman_tbl()
, then put it through any kind of dplyr operation.
lahman_tbl("Batting") lahman_tbl("Batting") %>% count(teamID, sort = TRUE)
If we'd rather write SQL than dplyr, we could also run lahman_query()
(which can also take a filename).
lahman_query("SELECT playerID, sum(AB) as AB FROM Batting GROUP BY playerID")
Finally, lahman_execute()
is for commands like CREATE
and DROP
that don't return a table, but rather execute a command on the database.
lahman_execute("CREATE TABLE Players AS SELECT playerID, SUM(AB) AS AB FROM Batting GROUP BY playerID") lahman_tbl("Players") lahman_execute("DROP TABLE Players")
Besides the _list
, _tbl
, _query
, and _execute
functions, the package also creates auto-completed table accessors.
# Same result as lahman_tbl("Batting") lahman_batting() # Same result as lahman_tbl("Managers") %>% count() lahman_managers() %>% count()
These are useful because they let you use auto-complete to complete table names as you're exploring a data source.
Please note that the 'dbcooper' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.