expand | R Documentation |
ICU data as handled by ricu consists mostly of time series data. Accordingly,
several utility functions are available for working with time series data, in
addition to a class dedicated to representing such data (see ts_tbl()). Some
terminology to begin with: a time series is considered to have gaps if, per
(combination of) ID variable value(s), some time steps are missing. Expanding
and collapsing mean changing between a representation where time steps are
explicit and one where they are encoded as intervals with start and end
times. For sliding window-type operations, slide() means to iterate over
time windows, slide_index() means to iterate over certain time windows
selected relative to the index, and hop() means to iterate over time windows
selected in absolute terms.
expand(
  x,
  start_var = index_var(x),
  end_var = NULL,
  step_size = time_step(x),
  new_index = start_var,
  keep_vars = NULL,
  aggregate = FALSE
)

collapse(
  x,
  id_vars = NULL,
  index_var = NULL,
  start_var = "start",
  end_var = "end",
  env = NULL,
  as_win_tbl = TRUE,
  ...
)

has_no_gaps(x)

has_gaps(...)

is_regular(x)

fill_gaps(x, limits = collapse(x), start_var = "start", end_var = "end")

remove_gaps(x)

slide(x, expr, before, after = hours(0L), ...)

slide_index(x, expr, index, before, after = hours(0L), ...)

hop(
  x,
  expr,
  windows,
  full_window = FALSE,
  lwr_col = "min_time",
  upr_col = "max_time",
  left_closed = TRUE,
  right_closed = TRUE,
  eval_env = NULL,
  ...
)
x |
|
start_var, end_var |
Names of the columns that represent lower and upper window bounds |
step_size |
Controls the step size used to interpolate between start_var and end_var |
new_index |
Name of the new index column |
keep_vars |
Names of the columns to hold onto |
aggregate |
Function for aggregating values in overlapping intervals |
id_vars, index_var |
ID and index variables |
env |
Environment used as parent to the environment used to evaluate
expressions passed as |
as_win_tbl |
Logical flag indicating whether to return a |
... |
Passed to |
limits |
A table with columns for lower and upper window bounds or a length 2 difftime vector |
expr |
Expression (quoted for |
before, after |
Time span to look back/forward |
index |
A vector of times around which windows are spanned (relative to the index) |
windows |
An |
full_window |
Logical flag controlling how the situation is handled where the sliding window extends beyond available data |
lwr_col, upr_col |
Names of columns (in |
left_closed, right_closed |
Logical flag indicating whether intervals are closed (default) or open. |
eval_env |
Environment in which |
A gap in a ts_tbl object is a missing time step, i.e. a missing entry in
the sequence seq(min(index), max(index), by = interval) in at least one
group (as defined by id_vars()), where the extrema are calculated per
group. In this case, has_gaps() will return TRUE. The function is_regular()
checks whether the time series has no gaps, in addition to the object being
sorted and unique (see is_sorted() and is_unique()).
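The gap definition above can be sketched in base R. This is only an illustration of the check, not ricu's implementation: has_gaps_sketch() is a hypothetical helper, and IDs and times are assumed to be plain numeric vectors with a step of one hour.

```r
# Hypothetical helper (not part of ricu): TRUE if at least one group is
# missing an entry of seq(min(index), max(index), by = interval).
has_gaps_sketch <- function(id, index, interval = 1) {
  any(vapply(split(index, id), function(idx) {
    # Extrema are calculated per group
    expected <- seq(min(idx), max(idx), by = interval)
    !all(expected %in% idx)
  }, logical(1)))
}

id    <- c(1, 1, 1, 2, 2)
index <- c(1, 2, 3, 1, 3)  # hours; group 2 is missing hour 2

has_gaps_sketch(id, index)            # TRUE: group 2 has a gap
has_gaps_sketch(id[1:3], index[1:3])  # FALSE: group 1 alone is complete
```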
In order to transform a time series containing gaps into a regular time
series, fill_gaps() will fill missing time steps with NA values in all
data_vars() columns, while remove_gaps() provides the inverse operation
of removing time steps that consist of NA values in data_vars() columns.
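The relationship between filling and removing gaps can be sketched on a plain data.frame. Both helpers below are hypothetical, not ricu functions. They assume columns named id and time (numeric hours) plus a single data column val that contains no NA values to begin with.

```r
# Hypothetical helpers (not part of ricu), for a data.frame with columns
# id, time (numeric hours) and one data column val.
fill_gaps_sketch <- function(df, interval = 1) {
  # Build the complete per-group time grid
  grid <- do.call(rbind, lapply(split(df, df$id), function(g) {
    data.frame(id = g$id[1],
               time = seq(min(g$time), max(g$time), by = interval))
  }))
  # Left join: time steps missing from df receive NA in val
  res <- merge(grid, df, by = c("id", "time"), all.x = TRUE)
  res <- res[order(res$id, res$time), ]
  rownames(res) <- NULL
  res
}

remove_gaps_sketch <- function(df) {
  # Drop time steps whose data column is NA
  res <- df[!is.na(df$val), ]
  rownames(res) <- NULL
  res
}

df <- data.frame(id = c(1, 1, 2, 2), time = c(1, 3, 1, 2),
                 val = c(10, 30, 5, 6))

filled <- fill_gaps_sketch(df)  # adds a row for id 1, hour 2 with val = NA
remove_gaps_sketch(filled)      # recovers the original four rows
```

Under these assumptions, applying remove_gaps_sketch() to the filled table gives back the input, which is the sense in which the two operations are inverses.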
An expand() operation performed on an object inheriting from data.table
yields a ts_tbl where time steps encoded by columns start_var and end_var
are made explicit, with values in keep_vars being appropriately repeated.
The inverse operation is available as collapse(), which groups by id_vars,
represents index_var as group-wise extrema in two new columns start_var and
end_var, and allows for further data summary using ... . An aspect to keep
in mind when applying expand() to a win_tbl object is that values are simply
repeated for all time steps that fall into a given validity interval. This
gives correct results when a win_tbl, for example, contains data on
infusions as rates, but might not lead to correct results when infusions are
represented as drug amounts administered over a given time span. In such a
scenario it might be desirable to evenly distribute the total amount over
the corresponding time steps (currently not implemented).
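The difference between repeating a value and distributing it can be sketched in base R. expand_sketch() and its distribute argument are hypothetical and not part of ricu. The sketch assumes numeric times in hours and treats both interval end points as included.

```r
# Hypothetical helper (not part of ricu): make the time steps of each
# interval explicit. With distribute = FALSE the value is repeated (suitable
# for rates); with distribute = TRUE it is split evenly over the time steps
# (suitable for total amounts).
expand_sketch <- function(start, end, value, step = 1, distribute = FALSE) {
  rows <- Map(function(s, e, v) {
    steps <- seq(s, e, by = step)
    data.frame(time = steps,
               value = if (distribute) v / length(steps) else v)
  }, start, end, value)
  do.call(rbind, rows)
}

# One infusion covering hours 1 to 4
as_rate   <- expand_sketch(start = 1, end = 4, value = 100)
as_amount <- expand_sketch(start = 1, end = 4, value = 100,
                           distribute = TRUE)

as_rate$value    # 100 100 100 100
as_amount$value  # 25 25 25 25
```

If the value is a total amount, only the distributed version preserves the sum over the interval, which is the concern raised above.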
Sliding window-type operations are available as slide(), slide_index()
and hop() (function naming is inspired by the CRAN package slider). The
most flexible of the three, hop(), takes as input a ts_tbl object x
containing the data, an id_tbl object windows containing, for each ID, the
desired windows represented by two columns lwr_col and upr_col, as well as
an expression expr to be evaluated per window. At the other end of the
spectrum, slide() spans windows for every ID and available time step using
the arguments before and after, while slide_index() can be seen as a
compromise between the two, where windows are spanned for certain time
points, specified by index.
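How the three operations relate can be sketched in base R for a single ID. The helpers below are hypothetical and only mimic the windowing logic described above. They assume numeric times, closed window bounds, and that windows extending beyond the available data are evaluated on whatever data they contain.

```r
# Hypothetical helpers (not part of ricu) for one ID with numeric times.
hop_sketch <- function(time, value, lwr, upr, fun = sum) {
  # One result per window [lwr, upr], given in absolute terms
  mapply(function(lo, hi) fun(value[time >= lo & time <= hi]), lwr, upr)
}

slide_index_sketch <- function(time, value, index, before, after = 0,
                               fun = sum) {
  # Windows are spanned relative to the selected index times
  hop_sketch(time, value, index - before, index + after, fun)
}

slide_sketch <- function(time, value, before, after = 0, fun = sum) {
  # Every available time step serves as an index
  slide_index_sketch(time, value, time, before, after, fun)
}

time  <- 1:5
value <- c(1, 2, 3, 4, 5)

hop_sketch(time, value, lwr = c(1, 2), upr = c(3, 4))         # 6 9
slide_index_sketch(time, value, index = c(4, 5), before = 2)  # 9 12
slide_sketch(time, value, before = 2)                         # 1 3 6 9 12
```

Each helper is written in terms of the more general one, mirroring the description of hop() as the most flexible of the three.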
Most functions return ts_tbl objects with the exception of
has_gaps()/has_no_gaps()/is_regular(), which return logical flags.
if (FALSE) {
  # Intervals encoded by a start column y and an end column z
  tbl <- ts_tbl(x = 1:5, y = hours(1:5), z = hours(2:6), val = rnorm(5),
                index_var = "y")

  # Make the time steps between y and z explicit, then collapse back
  exp <- expand(tbl, "y", "z", step_size = 1L, new_index = "y",
                keep_vars = c("x", "val"))
  col <- collapse(exp, start_var = "y", end_var = "z", val = unique(val))
  all.equal(tbl, col, check.attributes = FALSE)

  # Sliding window-type operations
  tbl <- ts_tbl(x = rep(1:5, 1:5), y = hours(sequence(1:5)), z = 1:15)
  win <- id_tbl(x = c(3, 4), a = hours(c(2, 1)), b = hours(c(3, 4)))
  hop(tbl, list(z = sum(z)), win, lwr_col = "a", upr_col = "b")
  slide_index(tbl, list(z = sum(z)), hours(c(4, 5)), before = hours(2))
  slide(tbl, list(z = sum(z)), before = hours(2))

  # Checks for gaps and regularity
  tbl <- ts_tbl(x = rep(3:4, 3:4), y = hours(sequence(3:4)), z = 1:7)
  has_no_gaps(tbl)
  is_regular(tbl)

  # Overwrite the first time stamp of ID 3 with hour 2 (now duplicated)
  tbl[1, 2] <- hours(2)
  has_no_gaps(tbl)
  is_regular(tbl)

  # Overwrite hour 3 of ID 4 with hour 2, so that hour 3 is missing
  tbl[6, 2] <- hours(2)
  has_no_gaps(tbl)
  is_regular(tbl)
}