knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This DBI Extention package introduces a new S4 generic dbUpdateTable
. Updating tables is not the focus of the DBI package as pure analysis scripts treat a database tables as read-only.
In other domains, database maintainence requires updating information in rows of an existing table. Dropping the table and recreating it as is done by dbWriteTable with the overwrite
argument set to TRUE
is not an optimal solution for many applications.
No single generic update function could cover the diverse variety of possible UPDATE queries. The dbUpdateTable
S4 generic introduced in this package performs a specific type of DBMS UPDATE operation using the update join syntax. An example of this syntax is shown in the sql fragment shown below. This syntax is used by MariaDB and MySQL.
```{sql sqlUpdateTable, echo=TRUE, eval=FALSE} UPDATE T1 INNER JOIN T2 ON T1.C1 = T2.C1 [AND T1.C2 = T2.C2] SET T1.CX = T2.CX, T1.CY = T2.CY;
reference: [tutorial on UPDATE join syntax for MySQL](http://www.mysqltutorial.org/mysql-update-join/) Note that no `WHERE` clause is specified for the query. This by design; the reasoning for which will be explained later in the document. For `dbUpdateTable()` the table `T1` must be a pre-existing table in the DBMS. Table `T2` will be created as a temporary table in the DBMS using a dataset supplied as a function argument. This dataset will usually be a tibble, a data.frame, or another object coercible to a data.frame. the `dbUpdateTable` function call has two additional arguements `set` and `by` used for constructing the `SET` and `ON` clauses of the UPDATE query. ```r dbUpdateTable(con, "T1", T2, set = c("CX","CY"), by = c("C1","C2"))
There is a helper function sqlUpdateTable which generates the UPDATE sql but does not execute it. Helper functions are also provided to generate the sql for the ON
clause of the join as well as the multiples SET clauses of the UPDATE
. These are sqlOnClause()
and sqlSetClause()
.
T2 <- tibble::as_tibble(list(C1=1:26,C2=LETTERS,CX=0,CY=1)) sqlUpdateTable(con, "T1", T2, set = c("CX","CY"), by = c("C1","C2")) sqlOnClause(con,"T1","T2",by = c("C1","C2")); sqlSetClause(con,"T1","T2",set = c("CX","CY"));
It is best practice that the target DBMS table be indexed, with the indexed columns matching the columns specified by the by
argument. The temporary table used store the update dataset is indexed by the columns in the by
argument as part of the dbUpdateTable()
implementation. The UPDATE-JOIN
operation with indexed table is generally quite fast.
WHERE
caluse?Update queries can use a WHERE
clause to limit the scope of the update. However in an UPDATE-JOIN
the scope of the query is alreadly limited by the ON
clause of the join. This combined with the power of R
to precisely form the update dataset makes use of a WHERE
clause redundent, IMHO.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.