prep_nested | R Documentation |
A set of functions to simplify preparation of nested data for iterative (nested) forecasting with Nested Modeltime Tables.
extend_timeseries(.data, .id_var, .date_var, .length_future, ...)
nest_timeseries(.data, .id_var, .length_future, .length_actual = NULL)
split_nested_timeseries(.data, .length_test, .length_train = NULL, ...)
.data |
A data frame or tibble containing time series data. The data should have:
|
.id_var |
An id column |
.date_var |
A date or datetime column |
.length_future |
Varies based on the function:
|
... |
Additional arguments passed to the helper function. See details. |
.length_actual |
Can be used to slice the .actual_data to a most recent number of observations. |
.length_test |
Defines the length of the test split for evaluation. |
.length_train |
Defines the length of the training split for evaluation. |
Preparation of nested time series follows a 3-Step Process:
extend_timeseries(): A wrapper for timetk::future_frame() that extends a time series group-wise into the future.
The group column is specified by .id_var.
The date column is specified by .date_var.
The length into the future is specified with .length_future.
The ... are additional parameters that can be passed to timetk::future_frame().
nest_timeseries(): A helper for nesting your data into .actual_data and .future_data.
The group column is specified by .id_var.
The .length_future defines the length of the .future_data.
The remaining data is converted to the .actual_data.
The .length_actual can be used to slice the .actual_data to a most recent number of observations.
The result is a "nested data frame".
split_nested_timeseries(): A wrapper for timetk::time_series_split() that generates training/testing splits from the .actual_data column.
The .length_test is the primary argument that identifies the size of the testing sample. This is typically the same size as the .future_data.
The .length_train is an optional size of the training data.
The ... (dots) are additional arguments that can be passed to timetk::time_series_split().
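The three steps, including the optional .length_actual and .length_train arguments, can be sketched on a small synthetic data set. This is a minimal illustration rather than a recommended workflow: toy_tbl, its column names, and the chosen lengths (7, 20, 10) are assumptions made up for the example and do not come from the package.

```r
library(dplyr)
library(timetk)
library(modeltime)

# Hypothetical toy data (not shipped with any package):
# two daily series with 30 observations each
toy_tbl <- tibble(
  id    = rep(c("A", "B"), each = 30),
  date  = rep(seq(as.Date("2021-01-01"), by = "day", length.out = 30), times = 2),
  value = as.numeric(c(1:30, 101:130))
)

nested_toy_tbl <- toy_tbl %>%
  # Step 1: append 7 future dates per id (value is NA for those rows)
  extend_timeseries(.id_var = id, .date_var = date, .length_future = 7) %>%
  # Step 2: last 7 rows per id become .future_data;
  # .actual_data is sliced to the 20 most recent observations
  nest_timeseries(.id_var = id, .length_future = 7, .length_actual = 20) %>%
  # Step 3: hold out 7 rows for testing, with a 10-row training window
  split_nested_timeseries(.length_test = 7, .length_train = 10)

# One row per id, with list-columns .actual_data, .future_data and .splits
nested_toy_tbl

# Inspect the sizes of the nested data for the first series
nrow(nested_toy_tbl$.actual_data[[1]])
nrow(nested_toy_tbl$.future_data[[1]])
```

Setting .length_test equal to .length_future, as done here, follows the guidance above that the test sample is typically the same size as the .future_data.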
extract_nested_train_split() and extract_nested_test_split() are used to simplify extracting the training and testing data from the actual data. This can be helpful when making preprocessing recipes using the recipes package.
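As a rough sketch of how the extraction helpers might feed a preprocessing recipe, the snippet below pulls the training and testing sets for the first series and builds a bare recipe from the training set. The toy data, the column names, and the value ~ date formula are assumptions chosen for illustration only; a real recipe would normally add feature-engineering steps.

```r
library(dplyr)
library(timetk)
library(modeltime)
library(recipes)

# Hypothetical toy data (not shipped with any package):
# two daily series with 30 observations each
toy_tbl <- tibble(
  id    = rep(c("A", "B"), each = 30),
  date  = rep(seq(as.Date("2021-01-01"), by = "day", length.out = 30), times = 2),
  value = as.numeric(c(1:30, 101:130))
)

nested_toy_tbl <- toy_tbl %>%
  extend_timeseries(.id_var = id, .date_var = date, .length_future = 7) %>%
  nest_timeseries(.id_var = id, .length_future = 7) %>%
  split_nested_timeseries(.length_test = 7)

# Training and testing data for the series stored in row 1
train_tbl <- extract_nested_train_split(nested_toy_tbl, .row_id = 1)
test_tbl  <- extract_nested_test_split(nested_toy_tbl, .row_id = 1)

# A bare recipe built on the training data only (no steps added yet)
rec_spec <- recipe(value ~ date, data = train_tbl)
rec_spec
```

Building the recipe from the extracted training split, rather than from the full .actual_data, keeps the test observations out of any preprocessing that is later estimated from the data.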
library(dplyr)
library(timetk)
library(modeltime)

# walmart_sales_weekly is an example data set provided by timetk
nested_data_tbl <- walmart_sales_weekly %>%
    select(id, date = Date, value = Weekly_Sales) %>%

    # Step 1: Extends the time series by id
    extend_timeseries(
        .id_var        = id,
        .date_var      = date,
        .length_future = 52
    ) %>%

    # Step 2: Nests the time series into .actual_data and .future_data
    nest_timeseries(
        .id_var        = id,
        .length_future = 52
    ) %>%

    # Step 3: Adds a column .splits that contains training/testing indices
    split_nested_timeseries(
        .length_test = 52
    )

nested_data_tbl

# Helpers: Getting the Train/Test Sets
extract_nested_train_split(nested_data_tbl, .row_id = 1)
extract_nested_test_split(nested_data_tbl, .row_id = 1)