View source: R/parsnip-window_reg.R
window_reg | R Documentation |
window_reg() is a way to generate a specification of a window model before fitting and allows the model to be created using different backends.
window_reg(mode = "regression", id = NULL, window_size = NULL)
mode |
A single character string for the type of model. The only possible value for this model is "regression". |
id |
An optional quoted column name (e.g. "id") for identifying multiple time series (i.e. panel data). |
window_size |
The size of the window to which the window function is applied. By default, the window uses the full data set, which is rarely the best choice. |
A time series window regression is derived using window_reg().
The model can be created using the fit() function with the following engines:
"window_function" (default) - Performs a window forecast by applying a window_function (engine parameter) to a window of the size defined by window_size
window_function (default engine)
The engine uses window_function_fit_impl(). A time series window function applies a window_function to a window of the data (last N observations).
The function can return a scalar (single value) or multiple values that are repeated for each window.
Common use cases:
Moving Average Forecasts: Forecast forward a 20-day average
Weighted Average Forecasts: Exponentially weighting the most recent observations
Median Forecasts: Forecasting forward a 20-day median
Repeating Forecasts: Simulating a Seasonal Naive Forecast by broadcasting the last 12 observations of a monthly dataset into the future
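As a rough base-R illustration of the use cases above, the sketch below applies a function to the last N observations and repeats the result over a forecast horizon. It is not the package's implementation: the helper window_forecast(), its argument h, and the toy vector y are hypothetical names introduced only for this example.

```r
# Hypothetical helper (not part of any package): applies `window_function`
# to the last `window_size` observations and repeats the result `h` times.
window_forecast <- function(x, window_size, window_function, h, ...) {
  # Use the last `window_size` observations, or all of them if window_size = Inf
  n <- min(length(x), window_size)
  window <- tail(x, n)
  value <- window_function(window, ...)
  # A scalar is repeated; a vector is recycled (broadcast) to length `h`
  rep_len(value, h)
}

# Toy series (made-up values)
y <- c(10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 21, 20)

# Moving-average forecast: mean of the last 4 observations, 6 steps ahead
ma <- window_forecast(y, window_size = 4, window_function = mean, h = 6)

# Repeating forecast: broadcast the last 3 observations 7 steps ahead
rp <- window_forecast(y, window_size = Inf,
                      window_function = function(v) tail(v, 3), h = 7)

ma
rp
```

The scalar case yields a flat forecast, while the vector case cycles through the returned values, which is how a seasonal-naive style forecast can be imitated.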
The key engine parameter is the window_function. A function / formula:
If a function, e.g. mean, the function is used with any additional arguments, ..., in set_engine().
If a formula, e.g. ~ mean(., na.rm = TRUE), it is converted to a function.
This syntax allows you to create very compact anonymous functions.
Date and Date-Time Variable
It's a requirement to have a date or date-time variable as a predictor.
The fit() interface accepts date and date-time features and handles them internally.
fit(y ~ date)
ID features (Multiple Time Series, Panel Data)
The id parameter is populated using the fit() or fit_xy() function:
ID Example: Suppose you have 3 features:
y (target)
date (time stamp)
series_id (a unique identifier that identifies each time series in your data)
The series_id can be passed to window_reg() using fit():
window_reg(id = "series_id") specifies that the series_id column should be used to identify each time series.
fit(y ~ date + series_id) will pass series_id on to the underlying functions.
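To make the role of the ID column concrete, here is a minimal base-R sketch of what windowing each series separately might look like. It assumes the window function is evaluated within each series; the toy data frame panel and the result window_by_id are hypothetical and are not produced by the package.

```r
# Toy panel data: two series identified by `series_id` (made-up values)
panel <- data.frame(
  series_id = rep(c("A", "B"), each = 6),
  date      = rep(seq(as.Date("2020-01-01"), by = "month", length.out = 6),
                  times = 2),
  y         = c(1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60)
)

# Apply a 3-observation mean window separately within each series
window_by_id <- vapply(
  split(panel$y, panel$series_id),
  function(v) mean(tail(v, 3)),
  numeric(1)
)

window_by_id
```

Each series gets its own window value, so a forecast for series A is never influenced by the observations of series B.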
Window Function Specification (window_function)
You can specify a function / formula using purrr syntax.
If a function, e.g. mean, the function is used with any additional arguments, ..., in set_engine().
If a formula, e.g. ~ mean(., na.rm = TRUE), it is converted to a function.
This syntax allows you to create very compact anonymous functions.
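The function-versus-formula behaviour described above can be approximated in base R. The sketch below is only an illustration of the idea: as_window_function() is a hypothetical helper written for this example, and the actual conversion of purrr-style formulas is handled by the tidyverse tooling rather than by this code.

```r
# Hypothetical helper: returns functions unchanged and turns a one-sided
# formula into a function whose argument is available as `.` and `.x`.
as_window_function <- function(f) {
  if (is.function(f)) {
    return(f)
  }
  stopifnot(inherits(f, "formula"), length(f) == 2L)
  expr <- f[[2]]          # right-hand side of the one-sided formula
  env  <- environment(f)  # where other names in the formula are looked up
  function(.x, ...) eval(expr, list(. = .x, .x = .x), env)
}

x <- c(1, 2, NA, 4)

# A plain function: extra arguments are supplied at call time
f_fun <- as_window_function(mean)
res_fun <- f_fun(x, na.rm = TRUE)

# A formula: the extra arguments are written inside the formula itself
f_frm <- as_window_function(~ mean(., na.rm = TRUE))
res_frm <- f_frm(x)

# A weighted average of the last 3 observations, written with `.x`
f_wgt <- as_window_function(~ sum(tail(.x, 3) * c(0.1, 0.3, 0.6)))
res_wgt <- f_wgt(c(1, 2, 3, 4))
```

Both spellings of the mean give the same result here; the formula form is simply more compact when the arguments are fixed in advance.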
Window Size Specification (window_size)
The period can be non-seasonal (window_size = 1 or "none") or yearly seasonal (e.g., for monthly time stamps, window_size = 12, window_size = "12 months", or window_size = "yearly").
There are 3 ways to specify:
window_size = "all": A seasonal period is selected based on the periodicity of the data (e.g. 12 if monthly)
window_size = 12: A numeric frequency. For example, 12 is common for monthly data
window_size = "1 year": A time-based phrase. For example, "1 year" would convert to 12 for monthly data.
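To see why a phrase such as "1 year" corresponds to 12 observations for monthly data, the following base-R sketch counts how many time stamps fall within the most recent year of a monthly index. This is an assumption-laden illustration only: the variables idx, cutoff, and n_obs are hypothetical, and the package may perform the conversion differently.

```r
# Monthly time stamps covering three years (made-up index)
idx <- seq(as.Date("2015-01-01"), by = "month", length.out = 36)

# Date exactly one year before the last time stamp
cutoff <- seq(max(idx), by = "-1 year", length.out = 2)[2]

# Number of observations in the most recent year
n_obs <- sum(idx > cutoff)
n_obs
```

For a daily or quarterly index the same phrase would map to a different number of observations, which is the reason a time-based phrase can be more convenient than a hard-coded number.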
External Regressors (Xregs)
These models are univariate. No xregs are used in the modeling process.
fit.model_spec(), set_engine()
library(dplyr)
library(parsnip)
library(rsample)
library(timetk)
library(modeltime) # provides window_reg()
# Data
m750 <- m4_monthly %>% filter(id == "M750")
m750
# Split Data 80/20
splits <- initial_time_split(m750, prop = 0.8)
# ---- WINDOW FUNCTION -----
# Used to make:
# - Mean/Median forecasts
# - Simple repeating forecasts
# Median Forecast ----
# Model Spec
model_spec <- window_reg(
        window_size = 12
    ) %>%
    # Extra parameters passed as: set_engine(...)
    set_engine(
        engine = "window_function",
        window_function = median,
        na.rm = TRUE
    )
# Fit Spec
model_fit <- model_spec %>%
    fit(log(value) ~ date, data = training(splits))
model_fit
# Predict
# - The 12-month median repeats going forward
predict(model_fit, testing(splits))
# ---- PANEL FORECAST - WINDOW FUNCTION ----
# Weighted Average Forecast
model_spec <- window_reg(
        # Specify the ID column for Panel Data
        id = "id",
        window_size = 12
    ) %>%
    set_engine(
        engine = "window_function",
        # Create a Weighted Average
        window_function = ~ sum(tail(.x, 3) * c(0.1, 0.3, 0.6))
    )
# Fit Spec
model_fit <- model_spec %>%
    fit(log(value) ~ date + id, data = training(splits))
model_fit
# Predict: The weighted average (scalar) repeats going forward
predict(model_fit, testing(splits))
# ---- BROADCASTING PANELS (REPEATING) ----
# Simulating a Seasonal Naive Forecast by
# broadcasting the last 12 observations into the future
model_spec <- window_reg(
        id = "id",
        window_size = Inf
    ) %>%
    set_engine(
        engine = "window_function",
        window_function = ~ tail(.x, 12)
    )
# Fit Spec
model_fit <- model_spec %>%
    fit(log(value) ~ date + id, data = training(splits))
model_fit
# Predict: The sequence is broadcast (repeated) during prediction
predict(model_fit, testing(splits))