prequential_eval: Prequential evaluation

View source: R/methods.R

Description

Performs an evaluation procedure where training and test sets can be allocated in different ways, while always respecting the ordering provided by time (models are trained in the past and tested in the relative future).

Usage

prequential_eval(data, nfolds, FUN, form, window = "growing",
  fold.alloc.proc = "Tblock_SPall", alloc.pars = NULL,
  removeSP = FALSE, time = "time", site_id = "site",
  .keepTrain = TRUE, ...)

Arguments

data

full dataset

nfolds

number of folds into which the data set is split.
To set the number of time and space folds separately, set nfolds to NULL and pass t.nfolds and sp.nfolds as a list in alloc.pars (only available when fold.alloc.proc is Tblock_SPchecker, Tblock_SPcontig or Tblock_SPrand).

FUN

function with arguments

  • train training set

  • test testing set

  • time column name of time-stamps

  • site_id column name of location identifiers

  • form a formula for model learning

  • ... other arguments
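As an illustration of the interface listed above, here is a minimal sketch of a function that could be passed as FUN. The name lm_workflow, the toy data and the choice of a linear model are assumptions made for this example; they are not part of the package. The returned columns follow the layout described in the Value section.

```r
# Hypothetical workflow with the arguments FUN is expected to accept.
lm_workflow <- function(train, test, time, site_id, form, ...) {
  # Fit a linear model on the training set; extra arguments go to lm()
  model <- lm(form, data = train, ...)
  # Name of the response variable, taken from the left side of the formula
  target <- all.vars(form)[1]
  res <- data.frame(
    site  = test[[site_id]],
    time  = test[[time]],
    trues = test[[target]],
    preds = unname(predict(model, newdata = test))
  )
  # Use the caller's column names for the identifiers
  names(res)[1:2] <- c(site_id, time)
  res
}

# Toy data (made up for illustration): 3 sites observed at 6 time stamps
set.seed(1)
toy <- data.frame(
  time = rep(1:6, each = 3),
  site = rep(c("a", "b", "c"), times = 6),
  x    = rnorm(18)
)
toy$y <- 2 * toy$x + rnorm(18, sd = 0.1)

# Train in the past, test in the relative future
train <- toy[toy$time <= 4, ]
test  <- toy[toy$time > 4, ]
out <- lm_workflow(train, test, time = "time", site_id = "site", form = y ~ x)
```

Calling the function by hand on one train/test split, as done here, is a quick way to check a workflow before handing it to prequential_eval.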

form

a formula for model learning

window

type of blocked-time window ordering considered. Should be one of

  • growing - for each time block being tested, all previous time blocks are used for training

  • sliding - for each time block being tested, only the immediately preceding time blocks are used for training
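The difference between the two window types can be sketched in terms of block indices. The helper train_blocks and its width argument are hypothetical: the documentation does not say how many preceding blocks a sliding window keeps, so width is an assumption of this example and not a package parameter.

```r
# Hypothetical helper: which time blocks are used to train for a given test block?
train_blocks <- function(test_block, window = c("growing", "sliding"), width = 1) {
  window <- match.arg(window)
  # The first block has no past, so nothing is available for training
  if (test_block < 2) return(integer(0))
  previous <- seq_len(test_block - 1)
  if (window == "growing") {
    previous                       # all earlier blocks
  } else {
    utils::tail(previous, width)   # only the most recent `width` blocks (assumed)
  }
}

growing_idx <- train_blocks(4, "growing")
sliding_idx <- train_blocks(4, "sliding", width = 2)
```

Under these assumptions, testing on block 4 trains on blocks 1 to 3 with a growing window and on blocks 2 and 3 with a sliding window of width 2.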

fold.alloc.proc

name of fold allocation function. Should be one of

  • Tblock_SPall - each fold includes a block of contiguous time for all locations

  • Tblock_SPchecker - each fold includes a block of contiguous time for a systematically assigned (checkered) part of space

  • Tblock_SPcontig - each fold includes a block of contiguous time for a block of spatially contiguous locations

  • Tblock_SPrand - each fold includes a block of contiguous time for a randomly assigned part of space
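To make the idea of a "block of contiguous time for all locations" concrete, the following is a rough sketch of how observations could be assigned to time-blocked folds. It is not the package's Tblock_SPall implementation; the function name time_block_folds and the equal-size splitting rule are assumptions of this example.

```r
# Hypothetical sketch: assign every observation to one of `nfolds`
# contiguous time blocks, regardless of its location.
time_block_folds <- function(data, nfolds, time = "time") {
  stamps <- sort(unique(data[[time]]))
  # Map the ordered time stamps onto blocks 1..nfolds of near-equal size
  block_of_stamp <- ceiling(seq_along(stamps) * nfolds / length(stamps))
  # Look up the block of each row through its time stamp
  block_of_stamp[match(data[[time]], stamps)]
}

# Toy data (made up for illustration): 2 sites, 6 time stamps
toy <- data.frame(
  time = rep(1:6, each = 2),
  site = rep(c("a", "b"), times = 6)
)
toy$fold <- time_block_folds(toy, nfolds = 3)
```

Here each fold covers two consecutive time stamps and includes both sites, so the time ordering between folds is preserved.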

alloc.pars

parameters to pass onto fold.alloc.proc

removeSP

argument that determines whether spatio-temporal blocks including the space being used for testing should be removed from the training set. Default is FALSE, meaning the information is not removed

time

column name of time-stamp in data. Default is "time"

site_id

column name of location identifier in data. Default is "site"

.keepTrain

if TRUE (default), instead of the results of FUN being directly returned, a list is created with both the results and a data.frame with the time and site identifiers of the observations used in the training step.

...

other arguments to FUN

Value

If .keepTrain is TRUE, a list where each slot corresponds to one repetition or fold. Each slot is itself a list with two slots: results, containing the results of FUN, and train, containing a data.frame with the time and site_id identifiers of the observations used in the training step. If .keepTrain is FALSE, the results of FUN are returned directly. Usually, the result of FUN is a data.frame with location identifier site_id, time-stamp time, true values trues and the workflow's predictions preds.
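The nested structure described above can be handled as sketched below. The object mock_res is hand-built to mimic the documented shape when .keepTrain is TRUE; its fold names and values are invented for illustration and are not output from the package.

```r
# Mock return value with the documented shape: one slot per fold,
# each holding `results` (output of FUN) and `train` (training identifiers).
mock_res <- list(
  fold_2 = list(
    results = data.frame(site = c("a", "b"), time = c(2, 2),
                         trues = c(1.0, 2.0), preds = c(1.5, 2.0)),
    train   = data.frame(time = c(1, 1), site = c("a", "b"))
  ),
  fold_3 = list(
    results = data.frame(site = c("a", "b"), time = c(3, 3),
                         trues = c(3.0, 4.0), preds = c(3.0, 3.5)),
    train   = data.frame(time = c(1, 1, 2, 2), site = c("a", "b", "a", "b"))
  )
)

# Stack the per-fold predictions into a single data.frame
all_results <- do.call(rbind, lapply(mock_res, function(fold) fold$results))

# Example summary: root mean squared error over all tested observations
rmse <- sqrt(mean((all_results$trues - all_results$preds)^2))

# Number of training observations used in each fold
n_train <- sapply(mock_res, function(fold) nrow(fold$train))
```

Keeping the train slot makes it possible to check afterwards that every fold was trained only on observations that precede its test block.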


mrfoliveira/Evaluation-procedures-for-forecasting-with-spatio-temporal-data documentation built on April 11, 2021, 10:50 a.m.