Quick start guide

source(system.file("extdata", "vignettes", "helpers.R", package = "ricu"))

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(ricu)
src  <- "mimic_demo"
demo <- c(src, "eicu_demo")
demo_missing_msg(demo, "start.html")
knitr::opts_chunk$set(eval = FALSE)

Setting up ricu requires downloading datasets from several platforms. Two data sources, mimic_demo and eicu_demo, are available directly as R packages hosted on GitHub. The respective full-featured versions, mimic and eicu, as well as the hirid dataset, are available from PhysioNet, while the remaining standard dataset, aumc, is available from a separate website. The following steps cover package installation and data source setup, and conclude with some example data queries.

Package installation

Stable package releases are available from CRAN as

install.packages("ricu")

and the latest development version is available from GitHub as

remotes::install_github("eth-mds/ricu")

Demo datasets

The demo datasets mimic_demo and eicu_demo are listed as Suggests dependencies, so whether they are installed depends on the value passed as dependencies to the above package installation function. The following call explicitly installs the demo dataset packages:

cat(
  "```r\n",
  "install.packages(\n",
  "  c(", paste0("\"", sub("_", ".", demo), "\"", collapse = ", "), "),\n",
  "  repos = \"https://eth-mds.github.io/physionet-demo\"\n",
  ")\n",
  "```\n",
  sep = ""
)
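For readers who only see the generating code above, the call it renders can be sketched as follows. The package names are derived the same way as in the `cat()` call (replacing `_` with `.` in the source names); the check for already installed packages is an addition for illustration and not part of the original vignette.

```r
# Source names as used throughout this guide.
demo <- c("mimic_demo", "eicu_demo")

# Package names replace the first "_" with ".", as in the cat() call above.
pkgs <- sub("_", ".", demo)

# Illustrative addition: only install packages that are not yet available.
to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1L), quietly = TRUE)]

if (length(to_install) > 0L) {
  install.packages(
    to_install,
    repos = "https://eth-mds.github.io/physionet-demo"
  )
}
```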

Full datasets

Included with ricu are functions for download and setup of the following datasets: mimic (MIMIC-III), eicu, hirid, aumc and miiv (MIMIC-IV), which can be invoked in several different ways.
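As a rough sketch of what such a setup might look like for the mimic source: the example below assumes that PhysioNet credentials are supplied through the environment variables `RICU_PHYSIONET_USER` and `RICU_PHYSIONET_PASS` (for example via `.Renviron`), and that `src_data_avail()`, `setup_src_data()`, `download_src()` and `import_src()` accept a source name as shown. Please check the package reference for the exact arguments before relying on this.

```r
library(ricu)

# Report which data sources are already available locally.
src_data_avail()

# Assumption: credentials are read from these environment variables.
# Nothing is downloaded unless a user name has been configured.
if (nzchar(Sys.getenv("RICU_PHYSIONET_USER"))) {

  # Option 1: download and import in a single step.
  setup_src_data("mimic")

  # Option 2: run the two steps separately, e.g. to download on one
  # machine and import on another.
  # download_src("mimic")
  # import_src("mimic")
}
```

Full datasets are large, so the download and import can take considerable time and disk space.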

Concept loading

Many commonly used clinical data concepts are available for all data sources where the required data exists. An overview of available concepts can be obtained by calling explain_dictionary(), and concepts can be loaded using load_concepts():

src  <- "mimic_demo"
demo <- c(src, "eicu_demo")

head(explain_dictionary(src = demo))
load_concepts("alb", src, verbose = FALSE)

Concepts representing time-dependent measurements are loaded as ts_tbl objects, whereas static information is retrieved as id_tbl objects. Both classes inherit from data.table (and therefore also from data.frame) and can be coerced to either of the base classes using as.data.table() and as.data.frame(), respectively. Using data.table 'by-reference' operations, this is available as a zero-copy operation by passing by_ref = TRUE^[While data.table by-reference operations can be very useful due to their inherent efficiency benefits, much care is required if enabled, as they break with the usual base R by-value (copy-on-modify) semantics.].

(dat <- load_concepts("height", src, verbose = FALSE))
head(tmp <- as.data.frame(dat, by_ref = TRUE))
identical(dat, tmp)

Many functions exported by ricu use id_tbl and ts_tbl objects to enable more concise semantics. Merging an id_tbl with a ts_tbl, for example, automatically uses the columns identified by id_vars() of both tables as by.x/by.y arguments, while for two ts_tbl objects, the columns reported by id_vars() and index_var() are used to merge on.
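The merge behavior described above can be illustrated with two small hand-built tables. This is a minimal sketch: the column names and values are made up for illustration, and it relies on the constructor defaults (first column as ID variable, the single time-difference column as index variable).

```r
if (requireNamespace("ricu", quietly = TRUE)) {

  library(ricu)

  # Static data: one row per stay; the first column serves as ID variable.
  static <- id_tbl(stay = 1:2, age = c(64, 71))

  # Time series: `stay` is the ID variable and `time` (hourly time
  # differences) is picked up as the index variable.
  series <- ts_tbl(
    stay = rep(1:2, each = 3),
    time = hours(rep(0:2, 2)),
    hr   = c(80, 82, 85, 60, 61, 59)
  )

  id_vars(series)
  index_var(series)

  # No `by` arguments are needed: the ID columns are used for merging.
  merged <- merge(series, static)
  print(merged)
}
```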

When loading from multiple data sources simultaneously, load_concepts() will add a source column (which will be among the id_vars() of the resulting object), making it possible to identify which data source each stay ID belongs to.

load_concepts("weight", demo, verbose = FALSE)

Extending the concept dictionary

In addition to the ~100 concepts that are available by default, user-defined concepts can be added either as R objects or, more robustly, as JSON configuration files.
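A concept specified as an R object might look like the following sketch, which defines lactate dehydrogenase for the MIMIC-III demo data. It assumes that `concept()` and `item()` accept the arguments shown, that the measurement is stored in the `labevents` table, and that the item ID 50954 identifies it; both the ID and the unit are assumptions that should be verified against the dataset's item dictionary.

```r
if (requireNamespace("ricu", quietly = TRUE) &&
    requireNamespace("mimic.demo", quietly = TRUE)) {

  library(ricu)

  # Assumed specification: rows of `labevents` whose `itemid` equals 50954.
  ldh <- concept(
    "ldh",
    item("mimic_demo", "labevents", "itemid", 50954),
    description = "Lactate dehydrogenase",
    unit = "IU/L"
  )

  # A concept object can be passed to load_concepts() in place of a name.
  ldh_dat <- load_concepts(ldh, verbose = FALSE)
  print(head(ldh_dat))
}
```

Defining concepts in JSON instead keeps them outside of analysis scripts, which makes them easier to share and reuse across projects.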


ricu documentation built on Sept. 8, 2023, 5:45 p.m.