new_src_cfg    R Documentation

Internal utilities for working with data source configurations

Description

Data source configuration objects store information on data sources used throughout ricu. This includes URLs for downloading data sets, column specifications used for importing data sets, per-table default values for important columns (such as index columns) when loading data, and how the different patient identifiers used throughout a dataset relate to one another. Per dataset, a src_cfg object is created from a JSON file (see load_src_cfg()). It consists of several helper classes that compartmentalize the pieces of information outlined above. Alongside constructors for the various classes, several utilities are provided: inheritance checks, coercion functions, and functions to extract pieces of information from these objects.

Usage

new_src_cfg(name, id_cfg, col_cfg, tbl_cfg, ..., class_prefix = name)

new_id_cfg(
  src,
  name,
  id,
  pos = seq_along(name),
  start = NULL,
  end = NULL,
  table = NULL,
  class_prefix = src
)

new_col_cfg(src, table, ..., class_prefix = src)

new_tbl_cfg(
  src,
  table,
  files = NULL,
  cols = NULL,
  num_rows = NULL,
  partitioning = NULL,
  ...,
  class_prefix = src
)

is_src_cfg(x)

as_src_cfg(x)

is_id_cfg(x)

as_id_cfg(x)

is_col_cfg(x)

as_col_cfg(x)

is_tbl_cfg(x)

as_tbl_cfg(x)

src_name(x)

tbl_name(x)

src_extra_cfg(x)

src_prefix(x)

src_url(x)

id_var_opts(x)

default_vars(x, type)

Arguments

name

Name of the data source

id_cfg

An id_cfg object for the given data source

col_cfg

A list of col_cfg objects representing column defaults for all tables of the data source

tbl_cfg

A list of tbl_cfg objects containing information on how tables are organized (may be NULL)

...

Further objects to add (such as a URL specification)

class_prefix

A character vector of class prefixes that are added to the instantiated classes

src

Data source name

id, start, end

Name(s) of ID column(s), as well as respective start and end timestamps

pos

Integer valued position, ordering IDs by their cardinality

table

Table name

cols

List containing one list per column, each holding the string-valued entries name (column name as used by ricu), col (column name as used in the raw data) and spec (name of the readr::cols() column specification). Further entries are passed as arguments to the respective readr column specification

num_rows

A count indicating the expected number of rows

partitioning

A table partitioning is defined by a column name and a vector of numeric values that are passed as the vec argument to base::findInterval()

x

Object to coerce/query
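
To make the partitioning argument more concrete, the following base R sketch shows how a vector of numeric values, passed as vec to base::findInterval(), assigns rows to partitions. The list fields col and breaks, the column name subject_id and the toy data are assumptions chosen for illustration; this is not ricu's import code.

```r
# Hypothetical partitioning spec: a column name plus numeric break points.
# The field names `col` and `breaks` are assumptions made for this sketch.
partitioning <- list(col = "subject_id", breaks = c(0, 100, 200))

# Toy rows standing in for a raw table.
dat <- data.frame(
  subject_id = c(5, 150, 100, 250),
  value      = c(1.2, 3.4, 5.6, 7.8)
)

# findInterval() returns, for each value x, the index i such that
# vec[i] <= x < vec[i + 1]; values at or above the last break point get
# the index of the last break point.
part <- findInterval(dat[[partitioning$col]], vec = partitioning$breaks)

# Split the rows into one data.frame per partition index.
chunks <- split(dat, part)
```

Here part is the integer vector 1, 2, 2, 3: a value equal to a break point (100) falls into the interval that starts at that break point.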

Details

The following classes are used to represent data source configuration objects:

  • src_cfg: wraps objects id_cfg, col_cfg and optionally tbl_cfg

  • id_cfg: contains information on ID systems and is created from id_cfg entries in config files

  • col_cfg: contains column default settings represented by defaults entries in table configuration blocks

  • tbl_cfg: used when importing data and therefore encompasses the information in the files, num_rows and cols entries of table configuration blocks

A table, represented by a col_cfg, can have some of its columns marked as default columns for the following concepts; further column meanings can be specified via ...:

  • id_col: column will be used as ID for icu_tbl objects

  • index_col: column represents a timestamp variable and will be used as such for ts_tbl objects

  • val_col: column contains the measured variable of interest

  • unit_col: column specifies the unit of measurement in the corresponding val_col

Alongside constructors (new_*()), inheritance checking functions (is_*()), as well as coercion functions (as_*()), relevant utility functions include:

  • src_url(): retrieve the URL of a data source

  • id_var_opts(): column name(s) corresponding to ID systems

  • src_name(): name of the data source

  • tbl_name(): name of a table

Under some circumstances, coercion between objects yields a list of objects as return type. For example, coercing a src_cfg to tbl_cfg results in a list of tbl_cfg objects, as a data source typically comprises multiple tables.
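
The list-valued coercion described above can be illustrated with a small, self-contained S3 sketch. All names prefixed with toy_ (and as_toy_tbl_cfg()) are hypothetical stand-ins written for this example, and the way class_prefix is turned into class names is an assumption; none of this is ricu's actual implementation.

```r
# Toy constructor for a per-table configuration object (hypothetical).
toy_tbl_cfg <- function(src, table, class_prefix = src) {
  structure(
    list(src = src, table = table),
    # Assumed naming scheme: prefix-specific class first, base class last.
    class = c(paste0(class_prefix, "_tbl"), "toy_tbl_cfg")
  )
}

# Toy constructor for a per-source configuration object (hypothetical).
toy_src_cfg <- function(name, tables, class_prefix = name) {
  structure(
    list(name = name, tables = tables),
    class = c(paste0(class_prefix, "_cfg"), "toy_src_cfg")
  )
}

# Generic coercion function with one method per input class.
as_toy_tbl_cfg <- function(x) UseMethod("as_toy_tbl_cfg")

# A table configuration is returned unchanged.
as_toy_tbl_cfg.toy_tbl_cfg <- function(x) x

# A source configuration covers several tables, so the result is a list
# holding one table configuration per table.
as_toy_tbl_cfg.toy_src_cfg <- function(x) {
  lapply(x$tables, function(tbl) toy_tbl_cfg(x$name, tbl))
}

cfg  <- toy_src_cfg("demo", c("admissions", "labevents"))
tbls <- as_toy_tbl_cfg(cfg)
```

In this sketch tbls is a plain list of length two, and each element inherits from toy_tbl_cfg, mirroring how one data source maps to several table configurations.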

Value

Constructors new_*() as well as coercion functions as_*() return the respective objects, while inheritance tester functions is_*() return a logical flag.

  • src_url(): string valued data source URL

  • id_var_opts(): character vector of ID variable options

  • src_name(): string valued data source name

  • tbl_name(): string valued table name


ricu documentation built on Sept. 8, 2023, 5:45 p.m.