source(system.file("extdata", "vignettes", "helpers.R", package = "ricu")) knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(ricu)
src <- "mimic_demo"
demo <- c(src, "eicu_demo")
demo_missing_msg(demo, "start.html") knitr::opts_chunk$set(eval = FALSE)
In order to set up ricu
, download of datasets from several platforms is required. Two data sources, mimic_demo
and eicu_demo
are available directly as R packages, hosted on Github. The respective full-featured versions mimic
and eicu
, as well as the hirid
dataset are available from PhysioNet, while access to the remaining standard dataset aumc
is available from yet another website. The following steps guide through package installation, data source set up and conclude with some example data queries.
Stable package releases are available from CRAN as
install.packages("ricu")
and the latest development version is available from GitHub as
remotes::install_github("eth-mds/ricu")
The demo datasets r paste1(demo)
are listed as Suggests
dependencies and therefore their availability is determined by the value passed as dependencies
to the above package installation function. The following call explicitly installs the demo data set packages
cat( "```r\n", "install.packages(\n", " c(", paste0("\"", sub("_", ".", demo), "\"", collapse = ", "), "),\n", " repos = \"https://eth-mds.github.io/physionet-demo\"\n", ")\n", "```\n", sep = "" )
Included with ricu
are functions for download and setup of the following datasets: mimic
(MIMIC-III), eicu
, hirid
, aumc
and miiv
(MIMIC-IV), which can be invoked in several different ways.
RICU_DATA_PATH
. The current value can be retrieved by calling data_dir()
..csv
form has already been downloaded, this can be decompressed and copied to an appropriate sub-folder (mimic
, eicu
, hirid
or aumc
) to the directory identified by data_dir()
.ricu
download the required data, login credentials can be supplied as environment variables RICU_PHYSIONET_USER
/RICU_PHYSIONET_PASS
and RICU_AUMC_TOKEN
(the string the follows token=
in the download URL received from the AUMCdb data owners) or entered into the terminal manually in interactive sessions.ricu
converts .csv
files into a binary format using the fst package..fst
format (and potentially data download) is automatically triggered upon first access of a table. In interactive sessions, the user is asked for permission to setup the given data source and in non-interactive sessions, access to missing data throws an error.setup_src_data()
.Many commonly used clinical data concepts are available for all data sources, where the required data exists. An overview of available concepts is available by calling explain_dictionary()
and concepts can be loaded using load_concepts()
:
<<assign-src>> <<assign-demo>> head(explain_dictionary(src = demo)) load_concepts("alb", src, verbose = FALSE)
Concepts representing time-dependent measurements are loaded as ts_tbl
objects, whereas static information is retrieved as id_tbl
object. Both classes inherit from data.table
(and therefore also from data.frame
) and can be coerced to any of the base classes using as.data.table()
and as.data.frame()
, respectively. Using data.table
'by-reference' operations, this is available as zero-copy operation by passing by_ref = TRUE
^[While data.table
by-reference operations can be very useful due to their inherent efficiency benefits, much care is required if enabled, as they break with the usual base R by-value (copy-on-modify) semantics.].
(dat <- load_concepts("height", src, verbose = FALSE)) head(tmp <- as.data.frame(dat, by_ref = TRUE)) identical(dat, tmp)
Many functions exported by ricu
use id_tbl
and ts_tbl
objects in order to enable more concise semantics. Merging an id_tbl
with a ts_tbl
, for example, will automatically use the columns identified by id_vars()
of both tables, as by.x
/by.y
arguments, while for two ts_tbl
object, respective columns reported by id_vars()
and index_var()
will be used to merge on.
When loading form multiple data sources simultaneously, load_concepts()
will add a source
column (which will be among the id_vars()
of the resulting object), thereby allowing to identify stay IDs corresponding to the individual data sources.
load_concepts("weight", demo, verbose = FALSE)
In addition to the ~100 concepts that are available by default, adding user-defined concepts is possible either as R objects or more robustly, as JSON configuration files.
Data concepts consist of zero, one, or several data items per data source, encoding how to retrieve the corresponding data. The constructors concept()
and item()
can be used to instantiate concepts as R objects.
r
ldh <- concept("ldh",
item("mimic_demo", "labevents", "itemid", 50954),
description = "Lactate dehydrogenase",
unit = "IU/L"
)
load_concepts(ldh, verbose = FALSE)
Configuration files are looked for in both the package installation directory and in user-specified locations, either using the environment variable RICU_CONFIG_PATH
or by passing paths as function arguments (load_dictionary()
for example accepts a cfg_dirs
argument).
Mechanisms for both extending and replacing existing concept dictionaries are supported by ricu
. The file name of the default concept dictionary is called concept-dict.json
and any file with the same name in user-specified locations will be used as extensions. In order to forgo the internal dictionary, a different file name can be chosen, which then has to be passed as function argument (load_dictionary()
for example has a name
argument which defaults to concept-dict
)
A JSON-based concept akin to the one above can be specified as
{
"ldh": {
"unit": "IU/L",
"description": "Lactate dehydrogenase",
"sources": {
"mimic_demo": [
{
"ids": 50954,
"table": "labevents",
"sub_var": "itemid"
}
]
}
}
}
and this can (given that it is saved as concept-dict.json
in a directory pointed to by RICU_CONFIG_PATH
) then be loaded using load_concepts()
as
r
load_concepts("ldh", "mimic_demo")
For further details on constructing concepts, refer to documentation at ?concept
and ?item
.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.