prequential_eval: Prequential evaluation


View source: R/eval_framework.R

Description

Performs an evaluation procedure in which training and test sets can be allocated in different ways while always respecting temporal order: models are trained on the past and tested on the relative future.

Usage

prequential_eval(
  data,
  nfolds,
  FUN,
  form,
  window = "growing",
  fold.alloc.proc = "Tblock_SPall",
  alloc.pars = NULL,
  removeSP = FALSE,
  init_fold = 2,
  time = "time",
  site_id = "site",
  .keepTrain = TRUE,
  .parallel = TRUE,
  .verbose = ifelse(.parallel, FALSE, TRUE),
  ...
)

Arguments

data

full dataset

nfolds

number of folds into which the data set is split. To set the number of time and space folds separately, set nfolds to NULL and pass t.nfolds and sp.nfolds as a list to alloc.pars (only available when fold.alloc.proc is Tblock_SPrand)

FUN

function with arguments

  • train training set

  • test testing set

  • time column name of time-stamps

  • site_id column name of location identifiers

  • form a formula for model learning

  • ... other arguments
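A minimal sketch of a function that could be passed as FUN, assuming a linear model is an acceptable learner. The name lm_workflow and the choice of lm are illustrative assumptions; only the argument names and the suggested output columns (trues, preds) come from this page.

```r
# Hypothetical workflow; not part of the package.
lm_workflow <- function(train, test, time, site_id, form, ...) {
  # Drop identifier columns so they are not picked up as predictors
  # when the formula has the form y ~ .
  keep <- setdiff(colnames(train), c(time, site_id))
  model <- lm(form, data = train[, keep, drop = FALSE], ...)

  # The response is the first variable named in the formula
  target <- all.vars(form)[1]

  res <- data.frame(
    test[[site_id]],
    test[[time]],
    trues = test[[target]],
    preds = unname(predict(model, newdata = test))
  )
  colnames(res)[1:2] <- c(site_id, time)
  res
}

# Toy data: 3 sites observed at 4 time-stamps
set.seed(1)
toy <- data.frame(
  time = rep(1:4, each = 3),
  site = rep(c("a", "b", "c"), times = 4),
  x = rnorm(12)
)
toy$y <- 2 * toy$x + rnorm(12, sd = 0.1)

# Train on the past (time <= 3), test on the relative future (time == 4)
out <- lm_workflow(
  train = toy[toy$time <= 3, ],
  test = toy[toy$time == 4, ],
  time = "time", site_id = "site", form = y ~ x
)
```

Here out has one row per test observation, with the identifier columns named after the site_id and time arguments.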

form

a formula for model learning

window

type of blocked-time window ordering considered. Should be one of

  • growing - for each time block being tested, all previous time blocks are used for training

  • sliding - for each time block being tested, only the immediately preceding time blocks are used for training
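The difference between the two window types can be sketched as follows. The helper train_blocks and its width parameter (how many preceding blocks a sliding window keeps) are assumptions made for illustration; this page does not say how the package sizes the sliding window.

```r
# Hypothetical helper: which time blocks train the model for a given test block?
train_blocks <- function(test_block, window = c("growing", "sliding"), width = 1) {
  window <- match.arg(window)
  stopifnot(test_block >= 2, width >= 1)
  previous <- seq_len(test_block - 1)   # all blocks strictly in the past
  if (window == "growing") {
    previous                            # growing: keep every past block
  } else {
    tail(previous, width)               # sliding: keep only the last `width` blocks
  }
}

growing_4 <- train_blocks(4, "growing")             # 1 2 3
sliding_4 <- train_blocks(4, "sliding", width = 2)  # 2 3
```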

fold.alloc.proc

name of fold allocation function. Should be one of

  • Tblock_SPall - each fold includes a block of contiguous time for all locations

  • Tblock_SPchecker - each fold includes a block of contiguous time for a systematically assigned (checkered) part of space

  • Tblock_SPcontig - each fold includes a block of contiguous time for a block of spatially contiguous locations

  • Tblock_SPrand - each fold includes a block of contiguous time for a randomly assigned part of space
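As an illustration of the simplest scheme, Tblock_SPall, the sketch below assigns every row to one of nfolds contiguous time blocks, regardless of location. The function time_block_folds is hypothetical and is not the package's allocation code; it only shows the idea of blocking by ordered time-stamps.

```r
# Hypothetical helper: contiguous time blocks covering all locations.
time_block_folds <- function(data, nfolds, time = "time") {
  stamps <- sort(unique(data[[time]]))
  # Map the i-th ordered time-stamp to a block index in 1..nfolds
  block_of_stamp <- ceiling(seq_along(stamps) * nfolds / length(stamps))
  # Look up the block of each row through its time-stamp
  block_of_stamp[match(data[[time]], stamps)]
}

# 2 sites observed at 8 time-stamps, split into 4 time blocks
grid <- expand.grid(site = c("a", "b"), time = 1:8)
grid$fold <- time_block_folds(grid, nfolds = 4)
```

Each block then holds two consecutive time-stamps for both sites, so testing on block k after training on earlier blocks never uses future information.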

alloc.pars

parameters to pass onto fold.alloc.proc

removeSP

whether spatio-temporal blocks that include the locations being used for testing should be removed from the training set. Default is FALSE, meaning that this information is kept
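The effect of removing test locations from the training data can be sketched at the level of individual sites. This is a simplification: the package operates on spatio-temporal blocks, and drop_test_sites is a hypothetical helper, not its implementation.

```r
# Hypothetical helper: drop training rows whose site also appears in the test set.
drop_test_sites <- function(train, test, site_id = "site") {
  test_sites <- unique(test[[site_id]])
  train[!(train[[site_id]] %in% test_sites), , drop = FALSE]
}

train <- data.frame(time = rep(1:2, each = 3), site = rep(c("a", "b", "c"), 2))
test <- data.frame(time = 3, site = "b")

train_no_b <- drop_test_sites(train, test)
```

After the call, train_no_b only contains rows for sites a and c, so the model never sees past values from the location it is tested on.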

init_fold

first temporal fold to use for testing. Default is 2.

time

column name of time-stamp in data. Default is "time"

site_id

column name of location identifier in data. Default is "site"

.keepTrain

if TRUE (default), instead of the results of FUN being directly returned, a list is created with both the results and a data.frame with the time and site identifiers of the observations used in the training step.
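A rough sketch of the kind of structure this option describes. The helper wrap_with_train_ids and the list element names results and train are assumptions; this page does not state the names the package actually uses.

```r
# Hypothetical helper: pair the workflow results with the identifiers
# of the observations that were used for training.
wrap_with_train_ids <- function(results, train, time = "time", site_id = "site") {
  list(
    results = results,
    train = train[, c(time, site_id), drop = FALSE]
  )
}

train <- data.frame(time = c(1, 1, 2), site = c("a", "b", "a"), y = c(0.1, 0.4, 0.3))
results <- data.frame(site = "a", time = 3, trues = 0.5, preds = 0.45)

wrapped <- wrap_with_train_ids(results, train)
```

Keeping the training identifiers makes it possible to check afterwards which past observations informed each set of predictions.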

.parallel

Boolean indicating whether each block should be run in parallel. Default is TRUE

.verbose

Boolean indicating whether updates on progress should be printed. Defaults to FALSE when .parallel is TRUE, and to TRUE otherwise

...

other arguments to FUN

Value

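The results of FUN. Usually, a data.frame with location identifier site_id, time-stamp time, true values trues and the workflow's predictions preds. If .keepTrain is TRUE, a list containing both these results and a data.frame with the time and site identifiers of the training observations.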
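Assuming FUN returns a data.frame with trues and preds columns as described, an error metric can be computed from the results. The rmse helper and the toy values below are illustrative only and are not output from the package.

```r
# Root mean squared error over a results data.frame with trues and preds columns
rmse <- function(res) sqrt(mean((res$trues - res$preds)^2))

# Made-up results for two sites at two test time-stamps
res <- data.frame(
  site = c("a", "b", "a", "b"),
  time = c(3, 3, 4, 4),
  trues = c(1, 2, 3, 4),
  preds = c(1, 2, 2, 5)
)

overall <- rmse(res)
# Error per test time-stamp, to see how performance evolves over time
by_time <- sapply(split(res, res$time), rmse)
```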


mrfoliveira/STResampling-JDSA2020 documentation built on June 28, 2021, 7:01 p.m.