crossv_ts: Generate cross-validated time-series test/train sets


Description

Generate test/train sets for a time series. Each training set consists only of observations that come before its test set. This is also called "evaluation on a rolling forecasting origin".

Usage

crossv_ts(data, ...)

## S3 method for class 'data.frame'
crossv_ts(data, horizon = 1L, test_size = 1L,
  test_partial = FALSE, train_partial = TRUE, train_size = n,
  test_start = NULL, from = 1L, to = n, by = 1L, ...)

## S3 method for class 'grouped_df'
crossv_ts(data, horizon = 1L, test_size = 1L,
  test_partial = FALSE, train_partial = TRUE, train_size = n,
  test_start = NULL, from = 1L, to = n, by = 1L, ...)

Arguments

data

A data frame

...

Arguments passed to methods

horizon

Difference between the first test set observation and the last training set observation

test_size

Size of the test set

test_partial

If TRUE, then allows for partial test sets. If FALSE, does not allow for partial test sets. If a number, then it is the minimum allowable size of a test set.

train_partial

Same as test_partial, but for the training set.

train_size

The maximum size of the training set. This allows for a rolling-window training set, instead of using all observations from the start of the time series.

test_start, from, to, by

test_start is an integer vector of the starting index values of the test sets. If test_start is NULL, then these are generated from seq(from, to, by).
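
As a small illustration of how the starting indices arise, the sketch below uses only base R. The helper resolve_test_start() is hypothetical and not part of the package, and the series length n = 10 is made up for the example; it simply mirrors the documented fallback to seq(from, to, by).

```r
# Hypothetical helper (not part of resamplr): mimics the documented rule that
# a NULL test_start falls back to seq(from, to, by).
resolve_test_start <- function(test_start = NULL, from = 1L, to, by = 1L) {
  if (is.null(test_start)) {
    seq(from, to, by)
  } else {
    as.integer(test_start)
  }
}

n <- 10L  # assumed series length, for illustration only

# Default: every index from 1 to n is a candidate test-set start.
default_starts <- resolve_test_start(to = n)

# Start testing at observation 4 and move the origin two steps at a time.
strided_starts <- resolve_test_start(from = 4L, to = n, by = 2L)

# Explicitly supplied starts are used as given; from/to/by are ignored.
explicit_starts <- resolve_test_start(test_start = c(5L, 9L), to = n)
```

A larger by thins out the forecasting origins, which reduces the number of splits when refitting a model at every index would be too expensive.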

Details

In time-series cross-validation, the training set uses only observations that come before the test set. Suppose the time series has n observations, the training set has a maximum size of r <= n and a minimum size of s <= r, and the test set has a maximum size of p <= n and a minimum size of q <= p. Let h be the horizon. For each index i \in \{1, …, n\}:

  1. Select observations i, …, \min\{i + p - 1, n\} for the test set.

  2. Select observations \max\{i - h - r + 1, 1\}, …, i - h for the training set.

  3. Keep the split only if the test set has a size of at least q and the training set has a size of at least s.
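
The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it reads the test set as starting at i and spanning at most p observations, and the training set as ending at i - h and spanning at most r observations. The helper ts_split_indices() and its argument names are hypothetical.

```r
# Hypothetical helper (not part of resamplr): enumerate train/test index sets
# for a series of length n.
#   h: horizon, r: max training size, s: min training size,
#   p: max test size, q: min test size
ts_split_indices <- function(n, h = 1L, p = 1L, q = p, r = n, s = 1L,
                             starts = seq_len(n)) {
  splits <- list()
  for (i in starts) {
    test_end <- min(i + p - 1L, n)
    train_end <- i - h
    train_start <- max(train_end - r + 1L, 1L)
    # seq_len() with a clamped length guards against empty or negative ranges.
    test <- i - 1L + seq_len(max(test_end - i + 1L, 0L))
    train <- train_start - 1L + seq_len(max(train_end - train_start + 1L, 0L))
    # Keep the split only if both sets reach their minimum sizes.
    if (length(test) >= q && length(train) >= s) {
      splits[[length(splits) + 1L]] <- list(train = train, test = test)
    }
  }
  splits
}

# Ten observations, one-step-ahead forecasts, a rolling window of at most
# three training observations, and no partial training windows (s = r = 3).
splits <- ts_split_indices(n = 10L, h = 1L, p = 1L, r = 3L, s = 3L)
```

With these settings the first retained split trains on observations 1 to 3 and tests on observation 4, and the window then slides forward one observation at a time. Setting r = n instead would give an expanding window that always starts at the first observation.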

Value

A data frame with the following columns:

sample

A list of resample objects. Training sets.

.id

An integer vector of identifiers.

Methods (by class)

References


jrnold/resamplr documentation built on May 20, 2019, 1:05 a.m.