as_duckdb_backend: Convert to DuckDB Backend

as_duckdb_backend    R Documentation

Convert to DuckDB Backend

Description

Converts the input to a DataBackendDuckDB backed by a DuckDB database. The conversion depends on the input type:

  • data.frame: Creates a new DataBackendDataTable first using as_data_backend(), then proceeds with the conversion from DataBackendDataTable to DataBackendDuckDB.

  • mlr3::DataBackend: Creates a new DuckDB database at the specified path. The filename is determined by the hash of the DataBackend. If the file already exists, a connection to the existing database is established and the existing file is reused.

The created backend automatically reconnects to the database if the connection was lost, e.g. because the object was serialized to the filesystem and restored in a different R session. The only requirement is that the path does not change and that the path is accessible on all workers.
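The data.frame conversion described above can be sketched as follows. This is a minimal illustration, assuming the mlr3, mlr3db and duckdb packages are installed; the iris data set and the temporary directory name are chosen for the example only and are not part of the original documentation.

```r
library(mlr3db)

# Assumption: a fresh directory inside the session's temp dir, used only for this sketch.
db_dir = tempfile("mlr3db_duckdb_")

# data.frame -> DataBackendDataTable -> DataBackendDuckDB
backend = as_duckdb_backend(iris, path = db_dir)

# The backend behaves like any other mlr3 DataBackend.
print(backend$nrow)
print(backend$head(3))

# A backend can be wrapped into a task; "Species" is the target column of iris.
task = mlr3::as_task_classif(backend, target = "Species", id = "iris_duckdb")
print(task)
```

Because the database file is named after the hash of the backend, converting the same data into the same directory again should find the existing file rather than write a new one.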

Usage

as_duckdb_backend(data, path = getOption("mlr3db.duckdb_dir", ":temp:"), ...)

Arguments

data

(data.frame() | mlr3::DataBackend)
See description.

path

(character(1))
Path for the DuckDB databases. Either a valid path to a directory, which will be created if it does not exist, or one of the special strings:

  • ":temp:" (default): Temporary directory of the R session is used, see tempdir(). Note that this directory will be removed during the shutdown of the R session. Also note that this usually does not work for parallelization on remote workers. Set a custom path or use the special string ":user:" instead.

  • ":user:": User cache directory as returned by R_user_dir() is used.

The default for this argument can be configured via option "mlr3db.sqlite_dir" or "mlr3db.duckdb_dir", respectively. The database files will use the hash of the DataBackend as filename with file extension ".duckdb" or ".sqlite". If the database already exists on the file system, the converters will just establish a new read-only connection.
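As an illustration of configuring the default location, the sketch below sets the "mlr3db.duckdb_dir" option before converting. It assumes the mlr3db and duckdb packages are installed; the directory name is hypothetical and, for parallelization, would need to be replaced by a path that all workers can access.

```r
library(mlr3db)

# Hypothetical directory for this sketch; use a shared path (or ":user:") in practice.
db_dir = tempfile("mlr3db_duckdb_cache_")

# Change the default used by the `path` argument and remember the old setting.
old_opts = options(mlr3db.duckdb_dir = db_dir)

# No `path` given, so the option above decides where the database file is written.
backend = as_duckdb_backend(mtcars)

# The directory is created on demand and holds one file named after the backend's hash.
db_files = list.files(db_dir, pattern = "\\.duckdb$")
print(db_files)

# Restore the previous option value.
options(old_opts)
```

Passing path = ":user:" instead would place the file in the user cache directory returned by R_user_dir(), which persists across R sessions.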

...

(any)
Additional arguments, passed to DataBackendDuckDB.

Value

DataBackendDuckDB or Task.


mlr3db documentation built on Nov. 4, 2023, 5:06 p.m.