epi_slide | R Documentation |
Slide a function over variables in an epi_df object
Slides a given function over variables in an epi_df object.
This is useful for computations like rolling averages. The function supports
many ways to specify the computation, but by far the most common use case is
as follows:
# Create new column `cases_7dmed` that contains a 7-day trailing median of cases
epi_slide(edf, cases_7dmed = median(cases), .window_size = 7)
For two very common use cases, we provide optimized functions that are much
faster than epi_slide: epi_slide_mean() and epi_slide_sum(). We recommend
using these functions when possible.
See vignette("epi_df") for more examples.
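As a rough illustration of the recommendation above, the sketch below computes a 7-day trailing mean both ways. It assumes that dplyr and epiprocess are attached, that the `cases_deaths_subset` example data used later on this page is available, and that `epi_slide_mean()` takes the column to average as its second argument and forwards `na.rm` to the underlying mean. Check the `epi_slide_opt` help page for the exact signature and the default output column name in your installed version.

```r
library(dplyr)
library(epiprocess)

# General-purpose version: accepts any computation, but is slower.
general <- cases_deaths_subset %>%
  epi_slide(cases_7dav = mean(cases, na.rm = TRUE), .window_size = 7)

# Optimized version (assumed call shape): slides a mean over `cases`.
optimized <- cases_deaths_subset %>%
  epi_slide_mean(cases, .window_size = 7, na.rm = TRUE)
```

Both calls add one new column per input row; only the speed and the name of the new column are expected to differ.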
epi_slide(
  .x,
  .f,
  ...,
  .window_size = NULL,
  .align = c("right", "center", "left"),
  .ref_time_values = NULL,
  .new_col_name = NULL,
  .all_rows = FALSE
)
.x |
An epi_df object. |
.f |
Function, formula, or missing; together with ..., specifies the computation to slide. |
... |
Additional arguments to pass to the function or formula specified via .f. |
.window_size |
The size of the sliding window. The accepted values depend on the type of the time_value column in .x. |
.align |
The alignment of the sliding window. One of "right" (the default), "center", or "left". |
.ref_time_values |
The time values at which to compute the slide values. By default, this is all the unique time values in .x. |
.new_col_name |
Name for the new column that will contain the computed values. The default is "slide_value" unless your slide computations output data frames, in which case they will be unpacked (as in tidyr::unpack()) into the resulting epi_df. |
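.all_rows |
If TRUE, all rows of .x are kept in the output even when .ref_time_values is provided, with missing values in the slide output column(s) for time values outside .ref_time_values. The default is FALSE. |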
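To make the .align options more concrete, here is a small base-R sketch of which dates a 7-day window would cover around a reference date. The helper `window_range()` is hypothetical and not part of epiprocess. It assumes daily data and an odd window size, and it follows the usual reading of the three options: "right" is a trailing window ending at the reference time, "center" is symmetric around it, and "left" is a leading window starting at it.

```r
# Hypothetical helper (not part of epiprocess): first and last date of a
# daily window of `size` days around `ref`, for each alignment option.
window_range <- function(ref, size = 7, align = c("right", "center", "left")) {
  align <- match.arg(align)
  stopifnot(size %% 2 == 1) # keep "center" unambiguous in this sketch
  before <- switch(align,
    right = size - 1,        # trailing: window ends at `ref`
    center = (size - 1) / 2, # symmetric around `ref`
    left = 0                 # leading: window starts at `ref`
  )
  start <- ref - before
  c(start = start, end = start + size - 1)
}

ref <- as.Date("2020-03-10")
window_range(ref, align = "right")  # 2020-03-04 to 2020-03-10
window_range(ref, align = "center") # 2020-03-07 to 2020-03-13
window_range(ref, align = "left")   # 2020-03-10 to 2020-03-16
```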
Specifying .f via tidy evaluation
If specifying .f via tidy evaluation, in addition to the standard .data and .env, we make some additional "pronoun"-like bindings available:
.x, which is like .x in dplyr::group_modify; it is an ordinary object like an epi_df rather than an rlang pronoun like .data, which allows you to use additional dplyr, tidyr, and epiprocess operations. If you have multiple expressions in ..., .x won't let you refer to the output of the earlier expressions, but .data will.
.group_key, which is like .y in dplyr::group_modify.
.ref_time_value, which is the element of .ref_time_values that determined the time window for the current computation.
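The bindings listed above can be combined in a single call. The following is a minimal sketch, assuming dplyr and epiprocess are attached, that the `cases_deaths_subset` example data from this page is available, and that `.group_key` is a one-row tibble containing `geo_value` (as `.y` would be in dplyr::group_modify when grouping by `geo_value`). The output column names are made up for illustration.

```r
library(dplyr)
library(epiprocess)

pronoun_demo <- cases_deaths_subset %>%
  epi_slide(
    # `.x` is the window data as an ordinary epi_df, so `$` works on it.
    window_start = min(.x$time_value),
    # `.ref_time_value` is the reference time that defined this window.
    ref_time = .ref_time_value,
    # `.group_key` describes the current group (assumed to hold `geo_value`).
    geo = .group_key$geo_value,
    .window_size = 7
  ) %>%
  select(geo_value, time_value, window_start, ref_time, geo)
```

With the default right alignment, `ref_time` should match `time_value` on every row, and `window_start` should be at most six days earlier.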
An epi_df object with one or more new slide computation columns added. It
will be ungrouped if .x was ungrouped, and have the same groups as .x if .x
was grouped.
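The grouping rule for the return value can be checked directly. This is a small sketch, assuming dplyr and epiprocess are attached and that the `cases_deaths_subset` example data from this page is available and ungrouped.

```r
library(dplyr)
library(epiprocess)

# Ungrouped input: the result is expected to come back ungrouped.
ungrouped_out <- cases_deaths_subset %>%
  epi_slide(cases_7dav = mean(cases, na.rm = TRUE), .window_size = 7)

# Grouped input: the result is expected to keep the same groups.
grouped_out <- cases_deaths_subset %>%
  group_by(geo_value) %>%
  epi_slide(cases_7dav = mean(cases, na.rm = TRUE), .window_size = 7)

is_grouped_df(ungrouped_out)
is_grouped_df(grouped_out)
group_vars(grouped_out)
```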
epi_slide_opt for optimized slide functions
library(dplyr)
# Get the 7-day trailing standard deviation of cases and the 7-day trailing mean of cases
cases_deaths_subset %>%
  epi_slide(
    cases_7sd = sd(cases, na.rm = TRUE),
    cases_7dav = mean(cases, na.rm = TRUE),
    .window_size = 7
  ) %>%
  select(geo_value, time_value, cases, cases_7sd, cases_7dav)
# Note that epi_slide_mean could be used to more quickly calculate cases_7dav.
# In addition to the [`dplyr::mutate`]-like syntax, you can feed in a function or
# formula in a way similar to [`dplyr::group_modify`]:
my_summarizer <- function(window_data) {
  window_data %>%
    summarize(
      cases_7sd = sd(cases, na.rm = TRUE),
      cases_7dav = mean(cases, na.rm = TRUE)
    )
}

cases_deaths_subset %>%
  epi_slide(
    ~ my_summarizer(.x),
    .window_size = 7
  ) %>%
  select(geo_value, time_value, cases, cases_7sd, cases_7dav)
#### Advanced: ####
# The tidyverse supports ["packing"][tidyr::pack] multiple columns into a
# single tibble-type column contained within some larger tibble. Like dplyr,
# we normally don't pack output columns together. However, packing behavior can be turned on
# by providing a name for a tibble-type output:
cases_deaths_subset %>%
  epi_slide(
    slide_packed = tibble(
      cases_7sd = sd(.x$cases, na.rm = TRUE),
      cases_7dav = mean(.x$cases, na.rm = TRUE)
    ),
    .window_size = 7
  ) %>%
  select(geo_value, time_value, cases, slide_packed)

cases_deaths_subset %>%
  epi_slide(
    ~ tibble(
      cases_7sd = sd(.x$cases, na.rm = TRUE),
      cases_7dav = mean(.x$cases, na.rm = TRUE)
    ),
    .new_col_name = "slide_packed",
    .window_size = 7
  ) %>%
  select(geo_value, time_value, cases, slide_packed)
# You can also get ["nested"][tidyr::nest] format by wrapping your results in
# a list:
cases_deaths_subset %>%
  group_by(geo_value) %>%
  epi_slide(
    function(x, g, t) {
      list(tibble(
        cases_7sd = sd(x$cases, na.rm = TRUE),
        cases_7dav = mean(x$cases, na.rm = TRUE)
      ))
    },
    .window_size = 7
  ) %>%
  ungroup() %>%
  select(geo_value, time_value, slide_value)
# Use the geo_value or the ref_time_value in the slide computation
cases_deaths_subset %>%
  epi_slide(~ .x$geo_value[[1]], .window_size = 7)

cases_deaths_subset %>%
  epi_slide(~ .x$time_value[[1]], .window_size = 7)