as_sqlite_backend: Convert to SQLite Backend

View source: R/as_sqlite_backend.R

as_sqlite_backend    R Documentation

Convert to SQLite Backend

Description

Converts to a DataBackendDplyr backed by an RSQLite database. The behavior depends on the input type:

  • data.frame: Creates a new DataBackendDataTable first using as_data_backend(), then proceeds with the conversion from DataBackendDataTable to DataBackendDplyr.

  • mlr3::DataBackend: Creates a new SQLite database in the specified path. The filename is determined by the hash of the DataBackend. If the file already exists, a connection to the existing database is established and the existing file is reused.

The created backend automatically reconnects to the database if the connection was lost, e.g. because the object was serialized to the filesystem and restored in a different R session. The only requirement is that the path does not change and that the path is accessible on all workers.
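The two input types described above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: it assumes that mlr3, mlr3db, dplyr, dbplyr and RSQLite are installed, and the directory name "mlr3db-example" and the variable names are hypothetical.

```r
library(mlr3)
library(mlr3db)

# Hypothetical directory for the database files; it is created if it does not exist.
db_dir = file.path(tempdir(), "mlr3db-example")

# data.frame input: converted to a DataBackendDataTable first, then to SQLite.
b_df = as_sqlite_backend(iris, path = db_dir)

# DataBackend input: the in-memory backend of a task is written to SQLite directly.
b_task = as_sqlite_backend(tsk("iris")$backend, path = db_dir)

# Both results are DataBackendDplyr objects and support the usual backend methods.
class(b_df)
b_task$head(3)

# The database files are named after the hash of the source backend.
list.files(db_dir)
```

Calling as_sqlite_backend() again with the same data and path should reuse the existing file instead of writing a new one, since the filename is derived from the backend's hash.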

Usage

as_sqlite_backend(data, path = getOption("mlr3db.sqlite_dir", ":temp:"), ...)

Arguments

data

(data.frame() | mlr3::DataBackend)
See description.

path

(character(1))
Path for the SQLite databases. Either a valid path to a directory, which will be created if it does not exist, or one of the special strings:

  • ":temp:" (default): Temporary directory of the R session is used, see tempdir(). Note that this directory will be removed during the shutdown of the R session. Also note that this usually does not work for parallelization on remote workers. Set a custom path or use the special string ":user:" instead.

  • ":user:": User cache directory as returned by R_user_dir() is used.

The default for this argument can be configured via option "mlr3db.sqlite_dir" or "mlr3db.duckdb_dir", respectively. The database files will use the hash of the DataBackend as filename with file extension ".duckdb" or ".sqlite". If the database already exists on the file system, the converters will just establish a new read-only connection.
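As a rough sketch of how the default path might be configured, the option "mlr3db.sqlite_dir" can be set before the conversion. The directory name "mlr3db-cache" is a hypothetical choice for illustration; the example assumes mlr3db and its SQLite dependencies (dplyr, dbplyr, RSQLite) are installed.

```r
library(mlr3db)

# Hypothetical directory; use a location shared by all workers when parallelizing.
cache_dir = file.path(tempdir(), "mlr3db-cache")

# Change the default for the `path` argument and remember the previous options.
old_opts = options(mlr3db.sqlite_dir = cache_dir)

# No `path` given: the database file is written to the directory set via the option.
b = as_sqlite_backend(iris)
list.files(cache_dir)

# Restore the previous options.
options(old_opts)
```

Passing path = ":user:" instead would place the files in the user cache directory, which persists across R sessions, unlike the default ":temp:".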

...

(any)
Additional arguments, passed to DataBackendDplyr.

Value

DataBackendDplyr or Task.


mlr-org/mlr3db documentation built on Oct. 17, 2023, 11:36 p.m.