knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(ricu)
library(ggplot2)

on_cran <- function() !identical(Sys.getenv("NOT_CRAN"), "true")

create_filename <- function(base_dir, src) {
  paste0(file.path(base_dir, "extdata", "vignettes", "uom", src), ".rds")
}

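# Wrapper around ricu::load_concepts() that reads a cached result when one
# exists and the vignette is built on CRAN, and otherwise loads and caches it
load_concepts <- function(concepts, src, ...) {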

  cached_file <- create_filename(system.file(package = "ricu"), src)

  if (file.exists(cached_file) && on_cran()) {
    return(readRDS(cached_file))
  }

  res <- ricu::load_concepts(concepts, src, ...)
  dst <- create_filename(file.path("..", "inst"), src)

  if (!dir.exists(dirname(dst))) {
    dir.create(dirname(dst), recursive = TRUE)
  }

  saveRDS(res, dst, version = 2L)

  res
}

Possible mismatch in units

Working with multiple ICU datasets can be challenging because units of measurement differ. Combining data from different countries is especially prone to unit mismatches, as practices vary substantially. For instance, laboratory values in the US datasets are commonly reported in mg/dL, whereas European datasets commonly use mmol/L. Converting between the two requires the molecular weight of the substance and must therefore be handled on a case-by-case basis. Care is needed when loading data to account for this.
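To make the role of the molecular weight concrete, the conversion can be sketched as follows. The helper `mgdl_to_mmoll()` and the vector `mol_weights` are hypothetical names used for illustration only and are not part of ricu; the molecular weights are approximate reference values.

```r
# Hypothetical helper (not part of ricu): convert mg/dL to mmol/L.
# mol_weight is the molecular weight in g/mol, supplied per substance.
mgdl_to_mmoll <- function(x, mol_weight) {
  # mg/dL -> mg/L is a factor of 10; mg/L divided by g/mol gives mmol/L
  x * 10 / mol_weight
}

# Approximate molecular weights in g/mol (assumed reference values)
mol_weights <- c(lactate = 90.08, glucose = 180.16)

mgdl_to_mmoll(18, mol_weights[["lactate"]])  # roughly 2 mmol/L
mgdl_to_mmoll(90, mol_weights[["glucose"]])  # roughly 5 mmol/L
```

Because the factor depends on the substance, a single global rescaling cannot fix all laboratory values at once.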

ricu approach

All concepts that can be loaded with load_concepts() within ricu have been checked for units, and the units were converted where necessary.

For example, we load five concepts from two demo datasets:

data_src <- c("mimic_demo", "eicu_demo")

concepts <- c(
  "map", "lact", "crea", "bili", "plt"
)

dat <- lapply(data_src,
  function(src) load_concepts(concepts, src, verbose = FALSE)
)

names(dat) <- data_src

dat

We plot the density of each feature, restricted to values between the 5th and 95th percentiles, and report the median value per dataset:

take_quants <- function(x, lwr, upr) {
  qq <- quantile(x, probs = c(lwr, upr), na.rm = TRUE)
  x[!is.na(x) & x >= qq[1L] & x <= qq[2L]]
}

for (conc in concepts) {

  feat <- lapply(dat, `[[`, conc)
  meds <- vapply(feat, median, numeric(1L), na.rm = TRUE)
  feat <- Map(data.frame, src = names(feat),
              dat = lapply(feat, take_quants, lwr = 0.05, upr = 0.95))

  title <- paste0(conc, ": median values ",
    paste0(round(meds, 2), " (", names(meds), ")", collapse = ", ")
  )

  print(
    ggplot(Reduce(rbind, feat), aes(x = dat, fill = src)) +
      geom_density(alpha = 0.5) +
      xlab(conc) + theme_bw() + ggtitle(title)
  )
}

Note that the matching between datasets is not perfect, but the median values should align closely. The plots above are based on the demo datasets, which are rather small; with the full datasets, the agreement should generally be even better.

Concepts outside ricu dictionary

Not all relevant concepts are included in the ricu dictionary. When loading concepts outside the dictionary, we recommend checking whether the units match across datasets using the density plots and median values as shown above. In particular, if there is a clear difference in the median values, or if the density plots look "multimodal", there is reason to believe some unit conversion is required.
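As a rough numeric complement to the visual check, per-source medians can be compared directly. The sketch below is only an illustration: `check_medians()` and its `max_ratio` threshold are hypothetical and not part of ricu, the data are simulated, and the helper assumes strictly positive measurements.

```r
# Hypothetical helper (not part of ricu): compare per-source medians.
# `values` is a named list of numeric vectors, one per data source.
# Assumes strictly positive measurements, so the ratio is meaningful.
check_medians <- function(values, max_ratio = 2) {
  meds <- vapply(values, median, numeric(1L), na.rm = TRUE)
  ratio <- max(meds) / min(meds)
  list(medians = meds, ratio = ratio, suspicious = ratio > max_ratio)
}

# Simulated glucose values: src_a in mg/dL, src_b in mmol/L
set.seed(1)
glu <- list(
  src_a = rnorm(500, mean = 120, sd = 20),
  src_b = rnorm(500, mean = 120, sd = 20) / 18.016
)

res <- check_medians(glu)
res$suspicious  # flagged, since the medians differ by a large factor
```

The threshold of 2 is an arbitrary choice for this sketch; a flagged concept still needs manual inspection to determine the correct conversion.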



septic-tank/ricu documentation built on Jan. 30, 2021, 8:40 p.m.