knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
What does tv::time_varying() do?
Given data X, specs, and exposures:
For every patient or row e in the exposures:
    Let X_e = the data X filtered to the current patient
    Construct the "grid":
        Let f = the features from specs with use_for_grid=TRUE
        Let grid_times = the unique datetimes in X_e with features in f and datetime between e$exposure_start and e$exposure_stop
        The grid is now a one-row-per-break dataset, with the first break at e$exposure_start and the last break before e$exposure_stop
    For each grid period g in grid:
        For each row s in specs:
            Let xx = X_e filtered to X_e$feature == s$feature and X_e$datetime in the interval (g$row_start - s$lookback_end, g$row_start - s$lookback_start)
            Perform the aggregation s$aggregation on xx
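The loop above can be sketched in base R for a single exposure row. This is only an illustration, not the package's implementation: the helper name one_exposure, the toy data, the strict inequalities at the window ends, and the use of a plain count in place of s$aggregation are all assumptions made for the example.

```r
# Toy inputs for one patient (column names follow the vignette; values are made up).
X <- data.frame(
  feature  = c("lactate", "lactate", "bp"),
  datetime = as.POSIXct(c("2022-01-01 01:00:00", "2022-01-01 03:00:00",
                          "2022-01-01 02:00:00"), tz = "UTC"),
  value    = c(9, 10, 120),
  stringsAsFactors = FALSE
)
specs <- data.frame(
  feature        = c("lactate", "bp"),
  use_for_grid   = c(TRUE, FALSE),
  lookback_start = c(0, 0),
  lookback_end   = c(Inf, 7200),  # seconds
  stringsAsFactors = FALSE
)
e <- list(
  exposure_start = as.POSIXct("2022-01-01 00:00:00", tz = "UTC"),
  exposure_stop  = as.POSIXct("2022-01-01 04:00:00", tz = "UTC")
)

# Hypothetical helper: build the grid for one exposure, then count the
# observations of each feature that fall in each row's look-back window.
one_exposure <- function(X, specs, e) {
  f <- specs$feature[specs$use_for_grid]
  # Assumption: breaks must fall strictly inside the exposure window.
  in_window <- X$feature %in% f &
    X$datetime > e$exposure_start & X$datetime < e$exposure_stop
  grid_times <- sort(unique(X$datetime[in_window]))
  grid <- data.frame(
    row_start = c(e$exposure_start, grid_times),
    row_stop  = c(grid_times, e$exposure_stop)
  )
  tt <- as.numeric(X$datetime)
  for (j in seq_len(nrow(specs))) {
    s <- specs[j, ]
    counts <- vapply(seq_len(nrow(grid)), function(i) {
      t0 <- as.numeric(grid$row_start[i])
      lo <- t0 - s$lookback_end    # far end of the look-back window
      hi <- t0 - s$lookback_start  # near end of the look-back window
      # Open interval (lo, hi), as written in the pseudocode.
      sum(X$feature == s$feature & tt > lo & tt < hi)
    }, integer(1))
    grid[[paste0("n_", s$feature)]] <- counts
  }
  grid
}

out <- one_exposure(X, specs, e)
```

Every grid row is computed from the raw data alone, which is why the rows can be processed independently of one another.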
Does tv really loop over every single feature for every row in the grid independently?
Yes. This is a lot of looping and a good reason to use more than one core.
Does tv require any exposure history to get the counts or time-since right?
No. Each row is considered independently.
Usually the current grid row start time.
As of version 1.7.0, you can; use tv::time_varying(grid.only = TRUE).
Why is tv on prospective patients really slow?
You probably misunderstand the exposure dataset. You really only want the current exposure in the grid, with no other breakpoints, so set your exposure start to (e.g.) the current time and the exposure end to (e.g.) the current time plus one second.
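A one-second exposure window of this kind might be built as follows. This is a minimal sketch in base R; the column names follow the exposure dataset used elsewhere in this vignette, and pat_id = 1 is a placeholder.

```r
# Use the current time as the start of a one-second exposure window.
now <- Sys.time()

exposure <- data.frame(
  pat_id         = 1,        # placeholder patient id
  exposure_start = now,
  exposure_stop  = now + 1   # POSIXct arithmetic is in seconds
)
```

With a window this short, no other grid breakpoints can fall inside it, so the result should be a single row per patient.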
Another example of how to use time-varying. Let's say we want to break every 6 hours. Just add that as a feature. Here we give it a count with infinite look back, to count the 6-hour period we're in. We also want to include the endpoint, which we encode with aggregation "event".
library(tv)
library(tibble)
library(dplyr)
library(lubridate)

data <- tribble(
  ~ pat_id, ~ feature, ~ datetime, ~ value,
  1, "lactate", "2021-12-31 23:00:00", 9,
  1, "lactate", "2022-01-01 03:41:00", 10,
  1, "lactate", "2022-01-01 07:00:00", 11,
  1, "blood pressure", "2022-01-01 02:00:00", 120,
  1, "blood pressure", "2022-01-01 04:00:00", 115,
  1, "blood pressure", "2022-01-01 06:00:00", 118,
  1, "6-hour", "2022-01-01 00:00:00", NA_real_,
  1, "6-hour", "2022-01-01 06:00:00", NA_real_,
  1, "6-hour", "2022-01-01 12:00:00", NA_real_,
  1, "6-hour", "2022-01-01 18:00:00", NA_real_,
  1, "event", "2022-01-01 08:00:00", NA_real_,
  1, "event", "2022-01-01 13:00:00", NA_real_
) %>%
  mutate(datetime = as_datetime(datetime))

specs <- tribble(
  ~ feature, ~ use_for_grid, ~ lookback_start, ~ lookback_end, ~ aggregation,
  "lactate", TRUE, 0, Inf, "ts",
  "lactate", TRUE, 0, Inf, "lvcf",
  "blood pressure", FALSE, 0, 7200, "ts", # two hours
  "blood pressure", FALSE, 0, 7200, "lvcf", # two hours
  "6-hour", TRUE, 0, Inf, "n",
  "event", TRUE, 0, 0, "event"
)

exposure <- tibble(
  pat_id = 1,
  encounter = 1:2, # optional id
  exposure_start = as_datetime(c("2022-01-01 00:00:00", "2022-01-01 08:00:00")),
  exposure_stop = as_datetime(c("2022-01-01 08:00:00", "2022-01-01 13:00:00"))
)

time_varying(data, specs, exposure = exposure, time_units = "seconds", n_cores = 1) %>%
  arrange(pat_id, row_start)
Note that the lactate lab from 2021 does not contribute to a new row because it is not inside the exposure window. Note also that the look back is ignored for aggregation of type "event".
time_varying() function
Run two tv's and merge the results: one for static variables and one for dynamic variables.
If there are multiple encounters per person, you can just add another row to the exposure data, and tag it with another id column which gets carried forward by time_varying().
Simply pass the special value NA in the "lookback_end" column.
Simply pass the special value NA in the "lookback_start" column. You'll probably also want something like "lookback_end = Inf".
Set use_for_grid=FALSE for everything except an "hourly" feature. Then calculate every hour that the patient is at risk; set "feature" to "hourly", set "datetime" to the time stamp, and set "value" to hour(datetime).
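One way to build the "hourly" rows is sketched below in base R. The at-risk window and pat_id = 1 are made-up values, and as.POSIXlt()$hour stands in for lubridate's hour() so that the example needs no extra packages.

```r
# Hypothetical at-risk window for one patient.
at_risk_start <- as.POSIXct("2022-01-01 00:00:00", tz = "UTC")
at_risk_stop  <- as.POSIXct("2022-01-01 05:30:00", tz = "UTC")

# One time stamp per hour while the patient is at risk.
hours <- seq(at_risk_start, at_risk_stop, by = "hour")

hourly <- data.frame(
  pat_id   = 1,
  feature  = "hourly",
  datetime = hours,
  value    = as.POSIXlt(hours, tz = "UTC")$hour,  # hour of day, 0 to 23
  stringsAsFactors = FALSE
)
```

These rows would then be appended to the long-format data before the call to time_varying().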
Start with a dataset of start time and end time. IMPORTANT: be sure no intervals overlap; if they do, the next section would be better suited for you. Otherwise, the pseudocode below should do the trick:
data %>%
  tidyr::pivot_longer(c(start_time, end_time), names_to = "which_time", values_to = "datetime") %>%
  dplyr::mutate(
    value = +(which_time == "start_time")
  )
Then make the specs a "lvcf" with infinite look back.
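A specs row of that kind might look like the sketch below, using the same columns as the specs table earlier in this vignette. The feature name "on_treatment" is hypothetical, and use_for_grid = TRUE is an assumption (it makes every start and end time a grid break).

```r
specs_interval <- data.frame(
  feature        = "on_treatment",  # hypothetical feature name
  use_for_grid   = TRUE,            # assumption: break the grid at each start/end
  lookback_start = 0,
  lookback_end   = Inf,             # infinite look back
  aggregation    = "lvcf",
  stringsAsFactors = FALSE
)
```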
Start with a dataset of start time and end time. The pseudocode below should do the trick:
data %>%
  tidyr::pivot_longer(c(start_time, end_time), names_to = "which_time", values_to = "datetime") %>%
  dplyr::arrange(pat_id, datetime) %>%
  dplyr::group_by(pat_id) %>%
  dplyr::mutate(
    value = cumsum(ifelse(which_time == "start_time", 1, -1))
  ) %>%
  dplyr::ungroup()
Then make the specs a "lvcf" with infinite look back.
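To see what the running count looks like, here is the same idea worked through in base R on two overlapping intervals. The data are made up, and rbind(), order(), and ave() stand in for the tidyr and dplyr verbs so the sketch runs without extra packages.

```r
# Two overlapping intervals for one patient (made-up values).
intervals <- data.frame(
  pat_id     = c(1, 1),
  start_time = as.POSIXct(c("2022-01-01 00:00:00", "2022-01-01 02:00:00"), tz = "UTC"),
  end_time   = as.POSIXct(c("2022-01-01 04:00:00", "2022-01-01 06:00:00"), tz = "UTC")
)

# Long format: one row per start or end time.
long <- rbind(
  data.frame(pat_id = intervals$pat_id, which_time = "start_time",
             datetime = intervals$start_time, stringsAsFactors = FALSE),
  data.frame(pat_id = intervals$pat_id, which_time = "end_time",
             datetime = intervals$end_time, stringsAsFactors = FALSE)
)
long <- long[order(long$pat_id, long$datetime), ]

# +1 at each start, -1 at each end; the running sum within a patient
# is the number of intervals open at that moment.
step <- ifelse(long$which_time == "start_time", 1, -1)
long$value <- ave(step, long$pat_id, FUN = cumsum)
```

Here the values run 1, 2, 1, 0: one interval open at 00:00, two at 02:00, one again at 04:00, and none at 06:00.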