make_split: Create train-test splits for time series cross-validation

make_split    R Documentation

Description

Create a split frame with train and test indices for one or more time series.

Usage

make_split(
  main_frame,
  context,
  type,
  value,
  n_ahead,
  n_skip = 0,
  n_lag = 0,
  mode = "slide",
  exceed = TRUE
)

Arguments

main_frame

A tibble containing the time series data.

context

A named list with the identifiers for series_id, value_id, and index_id.

type

Character value. The type of initial split. Possible values are "first", "last", and "prob".

value

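Numeric value specifying the size of the initial split. Its interpretation depends on type: a number of observations for "first" and "last", or a proportion of the sample for "prob".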

n_ahead

Integer. The forecast horizon, i.e. the number of observations in each test window.

n_skip

Integer. The number of observations to skip between split origins. The default is 0.

n_lag

Integer. The number of lagged observations to include before the test window. This is useful if lagged predictors are required when constructing test features. The default is 0.

mode

Character value. Either "slide" for a fixed training window or "stretch" for an expanding training window. The default is "slide".

exceed

Logical value. If TRUE, out-of-sample splits exceeding the original sample size are created. The default is TRUE.

Details

make_split() creates rolling-origin train-test splits for time series cross-validation. The output is used by functions such as slice_train() and slice_test() to extract the corresponding training and testing samples from main_frame.

The function supports two training-window modes:

  • mode = "slide" uses a fixed window. The training window has constant length and moves forward over time.

  • mode = "stretch" uses an expanding window. The training window starts at the first observation and grows over time.
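
The difference between the two modes can be illustrated with a minimal base R sketch. The helper train_window() and the variables n_init and origins are hypothetical names introduced for illustration only; they are not part of tscv, and the package's internal computation may differ.

```r
# Hypothetical helper (not part of tscv): training row positions for the
# split whose training window ends at row `origin`.
train_window <- function(origin, n_init, mode = c("slide", "stretch")) {
  mode <- match.arg(mode)
  if (mode == "slide") {
    # Fixed window: always the n_init rows ending at the origin
    seq(origin - n_init + 1, origin)
  } else {
    # Expanding window: every row from the first one up to the origin
    seq_len(origin)
  }
}

n_init  <- 5           # assumed length of the initial training window
origins <- c(5, 7, 9)  # assumed last training row of three consecutive splits

slide_train   <- lapply(origins, train_window, n_init = n_init, mode = "slide")
stretch_train <- lapply(origins, train_window, n_init = n_init, mode = "stretch")

lengths(slide_train)    # window length stays constant
lengths(stretch_train)  # window length grows with the origin
```

In both modes the training window ends at the same origin; only its starting row differs.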

The initial training window is controlled by type and value:

  • type = "first" uses the first value observations as the initial training window.

  • type = "last" keeps the last value observations for testing and derives the initial training window from the remaining sample.

  • type = "prob" uses floor(value * n_total) observations as the initial training window.
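
As a rough illustration, the three rules can be written as a small base R function. The helper initial_train_size() is hypothetical and not part of tscv. For type = "last", the sketch assumes that the initial training window is simply everything before the last value observations; the package may derive it differently.

```r
# Hypothetical helper (not part of tscv): size of the initial training window.
initial_train_size <- function(n_total, type = c("first", "last", "prob"), value) {
  type <- match.arg(type)
  switch(
    type,
    first = value,                  # the first `value` observations
    last  = n_total - value,        # assumption: all rows before the last `value` rows
    prob  = floor(value * n_total)  # share of the sample, rounded down
  )
}

n_total <- 10
initial_train_size(n_total, "first", 6)
initial_train_size(n_total, "last", 4)
initial_train_size(n_total, "prob", 0.75)
```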

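The argument n_skip controls how far the rolling origin moves between consecutive splits. With the default n_skip = 0, the origin advances by one observation per split, so consecutive test windows overlap whenever n_ahead > 1. For non-overlapping test windows, use n_skip = n_ahead - 1.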
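
The effect of n_skip on the test windows can be checked with a short base R sketch. It assumes that the origin advances by n_skip + 1 rows between splits and that only test windows lying fully inside the sample are kept (comparable to exceed = FALSE). All variable names are illustrative and not taken from tscv.

```r
# Assumption: the rolling origin advances by n_skip + 1 rows per split.
n_total <- 30
n_init  <- 12           # last row of the initial training window
n_ahead <- 6
n_skip  <- n_ahead - 1  # setting for non-overlapping test windows

# Origins that still leave a full test window inside the sample
origins <- seq(from = n_init, to = n_total - n_ahead, by = n_skip + 1)

# Test row positions for each split: the n_ahead rows after the origin
test_windows <- lapply(origins, function(origin) origin + seq_len(n_ahead))

origins
test_windows

# TRUE if no row appears in more than one test window
anyDuplicated(unlist(test_windows)) == 0
```

Setting n_skip to a smaller value in this sketch produces more splits whose test windows share rows.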

Value

A tibble containing the split plan. The output has one row per time series and split, with list-columns train and test containing integer row positions.

See Also

Other time series cross-validation: make_future(), make_tsibble(), slice_test(), slice_train(), split_index()

Examples

library(tscv)   # provides make_split() and M4_monthly_data
library(dplyr)  # provides filter()

context <- list(
  series_id = "series",
  value_id = "value",
  index_id = "index"
)

main_frame <- M4_monthly_data |>
  filter(series == "M23100")

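# Fixed-window split plan: 120 training observations, 18-step test windows.
# n_skip = n_ahead - 1 gives non-overlapping test windows.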
fixed_split <- make_split(
  main_frame = main_frame,
  context = context,
  type = "first",
  value = 120,
  n_ahead = 18,
  n_skip = 17,
  n_lag = 0,
  mode = "slide",
  exceed = FALSE
)

fixed_split

# Expanding-window split plan: same settings, but the training window grows
# from the first observation instead of sliding.
expanding_split <- make_split(
  main_frame = main_frame,
  context = context,
  type = "first",
  value = 120,
  n_ahead = 18,
  n_skip = 17,
  n_lag = 0,
  mode = "stretch",
  exceed = FALSE
)

expanding_split

tscv documentation built on May 13, 2026, 9:07 a.m.