dbi.table-package: DBI Table
In dbi.table: Database Queries Using 'data.table' Syntax

dbi.table-package

R Documentation

DBI Table

Description

A dbi.table is a data structure that describes a SQL query (called the dbi.table's underlying SQL query). This query can be manipulated using data.table's [i, j, by] syntax.

Usage

dbi.table(conn, id, check.names = FALSE, key = NULL, stringsAsFactors = FALSE)

## S3 method for class 'dbi.table'
x[i, j, by, keyby, nomatch = NA, on = NULL]

Arguments

`conn`	A `DBIConnection` object, as returned by `dbConnect`. Alternatively, a `dbi.catalog` or a `dbi.table`, in which case the new `dbi.table` will use the connection embedded in the provided object.
`id`	An `Id`, a character string (which will be converted to an `Id` by `Id`), or a `SQL` object (advanced) identifying a database object (e.g., table or view) on `conn`.
`check.names`	Just as `check.names` in `data.table` and `data.frame`.
`key`	A character vector of one or more column names to set as the resulting `dbi.table`'s key.
`stringsAsFactors`	A logical value (default is `FALSE`). Convert all `character` columns to `factor`s when executing the `dbi.table`'s underlying SQL query and retrieving the result set.
`x`	A `dbi.table`.
`i`	A logical expression of the columns of `x`, a `dbi.table`, or a `data.frame`. Use `i` to select a subset of the rows of `x`. Note: unlike `data.table`, `i` cannot be a vector. When `i` is a logical expression, the rows where the expression is `TRUE` are returned. If the expression contains a symbol `foo` that is not a column name of `x` but that is present in the calling scope, then the value of `foo` will be substituted into the expression if `foo` is a scalar, or if `foo` is a vector and is the right-hand-side argument to `%in%` or `%chin%` (substitution occurs when the extract (`[`) method is evaluated). When `i` inherits from `data.frame`, it is coerced to a `dbi.table`. When `i` is a `dbi.table`, the rows of `x` that match (according to the condition specificed in `on`) the rows of `i` are returned. When `nomatch == NA`, all rows of `i` are returned (right outer join); when `nomatch == NULL`, only the rows of `i` that match a row of `x` are returned (inner join).
`j`	A list of expressions, a literal character vector of column names of `x`, an expression of the form `start_name:end_name`, or a literal numeric vector of integer values indexing the columns of `x`. Use `j` to select (and optionally, transform) the columns of `x`.
`by`	A list of expressions, a literal character vector of column names of `x`, an expression of the form `start_name:end_name`, or a literal numeric vector of integer values indexing the columns of `x`. Use `by` to control grouping when evaluating `j`.
`keyby`	Same as `by`, but additionally sets the key of the resulting `dbi.table` to the columns provided in `by`. May also be `TRUE` or `FALSE` when `by` is provided as an alternative way to accomplish the same operation.
`nomatch`	Either `NA` or `NULL`.
`on`	An unnamed character vector, e.g., `x[i, on = c("a", "b")]`, used when columns `a` and `b` are common to both `x` and `i`. Foreign key joins: As a named character vector when the join columns have different names in `x` and `i`. For example, `x[i, on = c(x1 = "i1", x2 = "i2")]` joins `x` and `i` by matching columns `x1` and `x2` in `x` with columns `i1` and `i2` in `i`, respectively. Foreign key joins can also use the binary operator `==`, e.g., `x[i, on = c("x1 == i1", "x2 == i2")]`. It is also possible to use `.()` syntax as `x[i, on = .(a, b)]`. Non-equi joins using binary operators `>=`, `>`, `<=`, `<` are also possible, e.g., `x[i, on = c("x >= a", "y <= b")]`, or `x[i, on = .(x >= a, y <= b)]`.

Value

A dbi.table.

Keys

A key marks a dbi.table as sorted with an attribute "sorted". The sorted columns are the key. The key can be any number of columns. Unlike data.table, the underlying data are not physically sorted, so there is no performance improvement. However, there remain benefits to using keys:

The key provides a default order for window queries so that functions like shift and cumsum give reproducible output.
dbi.table's merge method uses a dbi.table's key to determin the default columns to merge on in the same way that data.table's merge method does. Note: if a dbi.table has a foreign key relationship, that will be used to determin the default columns to merge on before the dbi.table's key is considered.

A table's primary key is used as the default key when it can be determined.

Differences vs. data.table Keys

There are a few key differences between dbi.table keys and data.table keys.

In data.table, NAs are always first. Some databases (e.g., PostgreSQL) sort NULLs last by default and some databases (e.g., SQLite) sort them first. as.data.frame does not change the order of the result set returned by the database. Note that as.data.table uses the dbi.table's key so that the resulting data.table is sorted in the usual data.table way.
The sort is not stable: the order of ties may change on subsequent evaluations of the dbi.table's underlying SQL query.

Strict Processing of Keys

By default, when previewing data (dbi.table's print method), the key is not included in the underlying SQL query's ORDER BY clause. However, the result set is sorted locally to resepct the key. This behavior is referred to as a non-strict evaluation of the key and the printed output labels the key (non-strict). To override the default behavior for a single preview, call print explicitly and provide the optional argument strict = TRUE. To change the default behavior, set the option dbitable.print.strict to TRUE.

Non-strict evaluation of keys reduces the time taken to retrieve the preview.

Examples

  # open a connection to the Chinook example database using duckdb
  duck <- chinook.duckdb()

  # create a dbi.table corresponding to the Album table on duck
  Album <- dbi.table(duck, DBI::Id(table_name = "Album"))

  # the print method displays a 5 row preview
  # print(Album)
  Album

  # 'id' can also be 'SQL'; use the same DBI connection as Album
  Genre <- dbi.table(Album, DBI::SQL("chinook_duckdb.main.Genre"))

  # use the extract ([...]) method to subset the dbi.table
  Album[AlbumId < 5, .(Title, nchar = paste(nchar(Title), "characters"))]

  # use csql to see the underlying SQL query
  csql(Album[AlbumId < 5, #WHERE
             .(Title, #SELECT
               nchar = paste(nchar(Title), "characters"))])

dbi.table documentation built on June 29, 2025, 1:07 a.m.