
constellation


Overview

Constellation contains a set of functions for applying multidimensional, time window based logic to time series data frames of arbitrary length. Constellation was developed to enable rapid and flexible identification of series of events that occur in hospitalized patients. The functions have been abstracted for general purpose use with time series data. Constellation extends and provides a friendly API to rolling joins and overlap joins implemented in data.table. Three datasets (labs, vitals, and orders) with randomly synthesized time series data for a cohort of 100 patients are included to facilitate testing of functions.
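For readers unfamiliar with the data.table joins mentioned above, here is a minimal sketch of a rolling join on made-up data. The tables labs_toy and orders_toy and their columns are hypothetical (they are not the datasets shipped with the package), and the code illustrates only the underlying data.table mechanism, not constellation's internals.

```r
library(data.table)

# Hypothetical toy data: two lab results and two orders for one patient
labs_toy <- data.table(
  PAT_ID = c(1L, 1L),
  LAB_TIME = as.POSIXct(c("2010-01-01 08:00:00", "2010-01-01 20:00:00"), tz = "UTC")
)
orders_toy <- data.table(
  PAT_ID = c(1L, 1L),
  ORDER_TIME = as.POSIXct(c("2010-01-01 09:30:00", "2010-01-02 07:00:00"), tz = "UTC")
)

# Copy the time columns into a shared join column so the originals survive the join
labs_toy[, JOIN_TIME := LAB_TIME]
orders_toy[, JOIN_TIME := ORDER_TIME]

# Rolling join: for each order, find the most recent lab at or before the
# order time, looking back at most 6 hours (roll is in seconds for POSIXct)
matched <- labs_toy[orders_toy, on = .(PAT_ID, JOIN_TIME), roll = 6 * 60 * 60]
print(matched)
```

In this toy data the first order has a lab result 1.5 hours earlier and is matched to it, while the second order has no lab result in the preceding 6 hours and gets a missing LAB_TIME.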

There are five functions in constellation for building complex features from time series data: constellate(), constellate_criteria(), bundle(), incidents(), and value_change().

The constellate_criteria() and bundle() functions are similar, but bundle() is anchored to a specific event table. The first data frame passed to bundle() serves as the anchor: the function identifies events in the subsequent data frames that occur within a given time window of the anchor events. Because of this, the order of the data frames is significant, and passing a different data frame as the first argument will generate different results. In contrast, constellate_criteria() identifies events that occur within a given time window of events in any of the data frames supplied to it, so the order in which you pass data frames is insignificant and different orders generate equivalent results.
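The idea of an anchored search can be sketched directly in data.table with a non-equi join. This is an illustration of the concept only, not the implementation of bundle(); the tables anchor and treatment, their columns, and the window of 2 hours before to 6 hours after are all assumptions made up for the example.

```r
library(data.table)

# Hypothetical anchor events (e.g. onset of a condition), one per patient
anchor <- data.table(
  PAT_ID = c(1L, 2L),
  ANCHOR_TIME = as.POSIXct(c("2010-01-01 12:00:00", "2010-01-01 12:00:00"), tz = "UTC")
)
# Hypothetical treatment events to search through
treatment <- data.table(
  PAT_ID = c(1L, 1L, 2L),
  TREATMENT_TIME = as.POSIXct(
    c("2010-01-01 13:00:00", "2010-01-02 13:00:00", "2010-01-01 02:00:00"),
    tz = "UTC"
  )
)

# Search window around each anchor: 2 hours before to 6 hours after
anchor[, `:=`(
  WINDOW_START = ANCHOR_TIME - 2 * 60 * 60,
  WINDOW_END = ANCHOR_TIME + 6 * 60 * 60
)]

# Non-equi join: keep treatments that fall inside their patient's anchor window
in_window <- treatment[
  anchor,
  on = .(PAT_ID, TREATMENT_TIME >= WINDOW_START, TREATMENT_TIME <= WINDOW_END),
  .(PAT_ID = i.PAT_ID,
    ANCHOR_TIME = i.ANCHOR_TIME,
    TREATMENT_TIME = x.TREATMENT_TIME),
  nomatch = NULL
]
print(in_window)
```

Only the treatment one hour after patient 1's anchor survives. Swapping the roles of the two tables would ask a different question (which anchors fall near each treatment), which is the intuition behind the order of arguments mattering for an anchored search.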

Constellation can be used to build point-based scores for time series data (via constellate_criteria()), identify particular sequences of events that occur near each other (via constellate()), identify when specific changes occur for a given parameter (via value_change()), identify individual events that occur around a specified time stamp (via bundle()), and distinguish between events that are separated by a specified time window (via incidents()).
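To make the last of these ideas concrete, here is a small base R sketch of one plausible way to separate events into distinct incidents. It assumes that an event starts a new incident when it falls more than window_hours after the start of the previous incident; that rule, the variable names, and the data are assumptions for illustration and may not match the exact logic of incidents().

```r
# Hypothetical event times for a single patient
event_times <- as.POSIXct(
  c("2010-01-01 00:00:00", "2010-01-01 05:00:00",
    "2010-01-01 13:00:00", "2010-01-03 00:00:00"),
  tz = "UTC"
)
window_hours <- 12

# Walk through the sorted events and start a new incident whenever an event
# falls more than window_hours after the start of the current incident
event_times <- sort(event_times)
is_incident <- logical(length(event_times))
last_start <- NULL
for (i in seq_along(event_times)) {
  gap <- if (is.null(last_start)) {
    Inf
  } else {
    as.numeric(difftime(event_times[i], last_start, units = "hours"))
  }
  if (gap > window_hours) {
    is_incident[i] <- TRUE
    last_start <- event_times[i]
  }
}
incident_times <- event_times[is_incident]
print(incident_times)
```

With a 12-hour window, the event at 05:00 is folded into the incident that began at 00:00, while the other three events each start an incident of their own.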

If you are new to constellation, the best place to start is the sepsis vignette: vignette("identify_sepsis", package = "constellation"). You can also view the sepsis vignette on CRAN.

Installation

You can install constellation from CRAN with:

install.packages("constellation")
library(constellation)

You can install the development version of constellation from GitHub with:

devtools::install_github("marksendak/constellation")

If you have any questions, comments, or feedback, please email mark.sendak@gmail.com.

Example

Below are several variations on finding drops of 40 in systolic blood pressure within a 6-hour window.

Examine systolic blood pressure data:

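library(constellation)
# data.table provides the [i, j] subsetting and := syntax used below
library(data.table)

# Keep only the systolic blood pressure rows of the bundled vitals dataset
systolic_bp <- vitals[VARIABLE == "SYSTOLIC_BP"]
# Parse the ISO 8601 time stamps into POSIXct (UTC)
systolic_bp[, RECORDED_TIME := as.POSIXct(RECORDED_TIME, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
head(systolic_bp)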

Identify the first systolic blood pressure drop per patient:

systolic_bp_drop <- value_change(systolic_bp, value = 40, direction = "down",
    window_hours = 6, join_key = "PAT_ID", time_var = "RECORDED_TIME", 
    value_var = "VALUE", mult = "first")
head(systolic_bp_drop)

Identify the last systolic blood pressure drop per patient:

systolic_bp_drop <- value_change(systolic_bp, value = 40, direction = "down",
    window_hours = 6, join_key = "PAT_ID", time_var = "RECORDED_TIME", 
    value_var = "VALUE", mult = "last")
head(systolic_bp_drop)

Identify all systolic blood pressure drops per patient:

systolic_bp_drop <- value_change(systolic_bp, value = 40, direction = "down",
    window_hours = 6, join_key = "PAT_ID", time_var = "RECORDED_TIME", 
    value_var = "VALUE", mult = "all")
head(systolic_bp_drop)
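To show the kind of logic these calls express, here is a sketch that finds a drop on made-up data using a data.table non-equi self join. It is not the implementation of value_change(); the table bp_toy and its values are hypothetical, and treating a drop of exactly 40 as a match is an assumption of this sketch.

```r
library(data.table)

# Hypothetical systolic blood pressure readings for one patient
bp_toy <- data.table(
  PAT_ID = 1L,
  RECORDED_TIME = as.POSIXct(
    c("2010-01-01 00:00:00", "2010-01-01 03:00:00", "2010-01-01 10:00:00"),
    tz = "UTC"
  ),
  VALUE = c(130, 85, 120)
)

# For every reading, define the window of earlier readings to compare against
window_hours <- 6
bp_toy[, WINDOW_START := RECORDED_TIME - window_hours * 60 * 60]

# Non-equi self join: pair each reading (i) with prior readings (x) that were
# recorded within the preceding 6 hours
pairs <- bp_toy[
  bp_toy,
  on = .(PAT_ID, RECORDED_TIME >= WINDOW_START, RECORDED_TIME < RECORDED_TIME),
  .(PAT_ID = i.PAT_ID,
    PRIOR_TIME = x.RECORDED_TIME, PRIOR_VALUE = x.VALUE,
    CURRENT_TIME = i.RECORDED_TIME, CURRENT_VALUE = i.VALUE),
  nomatch = NULL
]

# Keep pairs where the value fell by 40 or more (assumed threshold rule)
drops <- pairs[PRIOR_VALUE - CURRENT_VALUE >= 40]
print(drops)
```

Here the reading of 85 at 03:00 follows a reading of 130 three hours earlier, so it is the only pair flagged as a drop; the reading at 10:00 has no earlier reading inside its 6-hour window.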

Why constellation?

In clinical medicine, there is a subset of conditions that are defined by a sequence of related events that unfold over time. These conditions are described as a "constellation of signs and symptoms," which is where the package gets its name.

Another piece of medical jargon that made it into the package is the concept of a treatment bundle. The bundle() function was originally designed to calculate the time stamp at which a group of treatments is delivered for every patient within a specified amount of time of developing a condition.

Duke Institute for Health Innovation

constellation was originally developed to support a machine learning project at the Duke Institute for Health Innovation to predict sepsis.



marksendak/constellation documentation built on May 29, 2019, 12:41 p.m.