knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
{doltr}
is a package to interface with Dolt, an
SQL database with git-like versioning.
You will need the dolt command-line utility installed on your computer to use {doltr}
. Installation instructions
for Linux, macOS, and Windows can be found here.
Install the R package with
remotes::install_github("ecohealthalliance/doltr")
{doltr}
package provides two DBI-compliant drivers
to connect to a dolt database dolt_remote()
connects to a dolt server via TCP,
and is a thin wrapper around the RMariaDB package
because Dolt shares a communication protocol with MySQL and MariaDB.
dolt_local()
connects to a Dolt database directory locally on-disk. Behind the
scenes dolt_local()
launches and manages a background server process, which
can also be done manually with dolt_server()
. Both can be used to connect to
a database as you would with other DBI packages:
library(doltr) remote_conn <- DBI::dbConnect(dolt_remote(), dname = "dbname", username = "user", ...)
local_conn <- DBI::dbConnect(dolt_local(), dir = "/path/to/my/dolt/db/directory")
Since Dolt has git-like versioning concepts, {doltr}
's API design includes
both components drawn from {DBI}
and also from git interfaces like {gert}
and {git2r}
(as well as Dolt's command-line interface).
{doltr}
has the concept of a "default database" for a project. When
working with git (or {git2r}
or {gert}
), commands apply to the current
working directory by default. Similarly, with {doltr}
, many commands use
the default database. The default database is set with the environment
directory DOLT_DIR
, which is doltdb
by default.
For a project you might set DOLT_DIR
in a project-level .Renviron
or
.env
file
To explore {doltr}
's capabilities, let's pull an existing database. dolt_clone()
,
like git clone
clones a database to a local directory, using DoltHub
as the default remote source (though dolt Database remotes can be hosted elsewhere).
We'll clone doltr/nycflights
, which contains a subset of
the data from the {nycflights13
} package.
if (dir.exists("nycflights")) unlink("nycflights", force = TRUE, recursive = TRUE)
library(doltr) dolt_clone("doltr/nycflights")
This creates an nycflights
directory. Let's set it as our default database for
this session:
Sys.setenv(DOLT_DIR="nycflights")
You can use the dolt()
function to connect to the database. dolt()
is a
shortcut for dbConnect(dolt_local/dolt_remote(), ...)
. It also caches the database
connection, so it can be called repeatedly in place of a connection variable.
dolt()
is also the default argument for a database connection in many functions.
Running dolt()
prints a summary of the database state:
dolt()
You can use dolt()
with {DBI}
or {dbplyr}
functions to read from or write
to the database:
library(DBI) library(dbplyr) library(dplyr)
dbListTables(dolt()) dbReadTable(dolt(), "airlines")
tbl(dolt(), "flights") %>% filter(origin == "EWR", dest == "MDW") %>% head() %>% collect()
dbWriteTable(dolt(), "mtcars", mtcars)
With the last command, we changed the database by adding a table. This is
reflected in the change to the database working state when we print dolt()
dolt()
The summary no longer says "Working database clean" but shows that the working state
of the database now includes a new table called mtcars
. As with a new file in
a git repository, we can stage this table for comitting, with dolt_add()
.
Rather than printing the whole database summary, we can get just the last
bullet with dolt_status()
dolt_add("mtcars") dolt_status()
dolt_status()
pretty-prints but actually yields a table of working or stages
changes to the database:
as.data.frame(dolt_status())
With the table staged, we can commit it to the database and provide a message:
dolt_commit(message = "Add mtcars table") dolt_status() dolt_last_commit()
You can view the commit history of the database with dolt_log()
, which collects
the Dolt system table dolt_log
:
dolt_log()
For RStudio users, {doltr}
provides a connection pane with which you can explore the database.
dolt_pane()
{doltr}
's connection pane shows information about the versioning state of your
database in addition to your tables, the Dolt system tables and the database
information schema.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.