View source: R/layer_cdc_flatline_quantiles.R
layer_cdc_flatline_quantiles | R Documentation |
This layer creates quantile forecasts by taking a sample from the interpolated CDF of the flatline residuals, and shuffling them. These are then added on to the point prediction.
layer_cdc_flatline_quantiles(
frosting,
...,
aheads = 1:4,
quantile_levels = c(0.01, 0.025, 1:19/20, 0.975, 0.99),
nsims = 1000,
by_key = "geo_value",
symmetrize = FALSE,
nonneg = TRUE,
id = rand_id("cdc_baseline_bands")
)
frosting |
a |
... |
Unused, include for consistency with other layers. |
aheads |
Numeric vector of desired forecast horizons. These should be
given in the "units of the training data". So, for example, for data
typically observed daily (possibly with missing values), but with weekly
forecast targets, you would use |
quantile_levels |
Numeric vector of probabilities with values in (0,1) referring to the desired predictive intervals. The default is the standard set for the COVID Forecast Hub. |
nsims |
Positive integer. The number of draws from the empirical CDF.
These samples are spaced evenly on the (0, 1) scale, F_X(x) resulting in
linear interpolation on the X scale. This is achieved with
|
by_key |
A character vector of keys to group the residuals by before
calculating quantiles. The default, |
symmetrize |
Scalar logical. If |
nonneg |
Scalar logical. Force all predictive intervals be non-negative.
Because non-negativity is forced before propagating forward, this has
slightly different behaviour than would occur if using |
id |
a random id string |
This layer is intended to be used in concert with flatline()
. But it can
also be used with anything else. As long as residuals are available in the
the fitted model, this layer could be useful. Like
layer_residual_quantiles()
it only uses the residuals for the fitted model
object. However, it propagates these forward for all aheads, by
iteratively shuffling them (randomly), and then adding them to the previous
set. This is in contrast to what happens with the flatline_forecaster()
.
When using flatline()
as the underlying engine (here), both will result in the
same predictions (the most recent observed value), but that model calculates
separate residuals for each ahead
by comparing to observations further into
the future. This version continues to use the same set of residuals, and
adds them on to produce wider intervals as ahead
increases.
an updated frosting
postprocessor. Calling predict()
will result
in an additional <list-col>
named .pred_distn_all
containing 2-column
tibble::tibble()
's. For each
desired combination of key
's, the tibble will contain one row per ahead
with the associated dist_quantiles()
.
library(recipes)
r <- epi_recipe(covid_case_death_rates) %>%
# data is "daily", so we fit this to 1 ahead, the result will contain
# 1 day ahead residuals
step_epi_ahead(death_rate, ahead = 1L, skip = TRUE) %>%
update_role(death_rate, new_role = "predictor") %>%
add_role(time_value, geo_value, new_role = "predictor")
forecast_date <- max(covid_case_death_rates$time_value)
f <- frosting() %>%
layer_predict() %>%
layer_cdc_flatline_quantiles(aheads = c(7, 14, 21, 28), symmetrize = TRUE)
eng <- linear_reg(engine = "flatline")
wf <- epi_workflow(r, eng, f) %>% fit(covid_case_death_rates)
preds <- forecast(wf) %>%
select(-time_value) %>%
mutate(forecast_date = forecast_date)
preds
preds <- preds %>%
tidyr::unnest(.pred_distn_all) %>%
pivot_quantiles_wider(.pred_distn) %>%
mutate(target_date = forecast_date + ahead)
library(ggplot2)
four_states <- c("ca", "pa", "wa", "ny")
preds %>%
filter(geo_value %in% four_states) %>%
ggplot(aes(target_date)) +
geom_ribbon(aes(ymin = `0.1`, ymax = `0.9`), fill = blues9[3]) +
geom_ribbon(aes(ymin = `0.25`, ymax = `0.75`), fill = blues9[6]) +
geom_line(aes(y = .pred), color = "orange") +
geom_line(
data = covid_case_death_rates %>% filter(geo_value %in% four_states),
aes(x = time_value, y = death_rate)
) +
scale_x_date(limits = c(forecast_date - 90, forecast_date + 30)) +
labs(x = "Date", y = "Death rate") +
facet_wrap(~geo_value, scales = "free_y") +
theme_bw() +
geom_vline(xintercept = forecast_date)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.