internal_workflow: A learning and prediction workflow with internal validation
In mrfoliveira/STResampling-JDSA2020: Biased Resampling Strategies for Imbalanced Spatio-Temporal Forecasting

A learning and prediction workflow that can handle NAs and uses internal validation to parametrize a re-sampling technique for balancing an imbalanced regression problem.

internal_workflow(
  train,
  test,
  form,
  model,
  time,
  site_id,
  resample.grid,
  resample.pars = NULL,
  internal.est = NULL,
  internal.est.pars = NULL,
  internal.evaluator = "int_util_evaluate",
  internal.eval.pars = NULL,
  metrics = c("F1.u", "rmse_phi"),
  metrics.max = c(TRUE, FALSE),
  stat = "MED",
  handleNAs = "centralImputNAs",
  min_train = 2,
  nORp = 0.2,
  .int_parallel = FALSE,
  .intRes = TRUE,
  .full_intRes = FALSE,
  ...
)

`train`	a data frame for training
`test`	a data frame for testing
`form`	a formula describing the model to learn
`model`	the name of the algorithm to use
`time`	the name of the column in `train` and `test` containing time-stamps
`site_id`	the name of the column in `train` and `test` containing location IDs
`resample.grid`	a data.frame in which each column is a `resample.pars` parameter and each row is a combination of values to test using `internal.est`. Any NA value in `resample.grid` sets the corresponding argument to NULL.
`resample.pars`	parameters to be passed to the re-sampling function. Default is NULL.
`internal.est`	character string identifying the internal estimator function to use
`internal.est.pars`	named list of internal estimator parameters (e.g., tr.perc or nfolds)
`internal.evaluator`	character string indicating internal evaluation function
`internal.eval.pars`	named list of parameters to feed to internal evaluation function
`metrics`	vector of names of two metrics to be used to determine the best parametrization (the second metric is only used in case of ties)
`metrics.max`	vector of Booleans indicating whether each metric in parameter `metrics` should be maximized (TRUE) or minimized (FALSE) for best results
`stat`	parameter indicating summary statistic that should be used to determine the best internal evaluation metric: "MED" (for median) or "MEAN" (for mean)
`handleNAs`	string indicating how to deal with NAs. If "centralImputNAs", training observations with at least 80% non-NA columns will have their NAs replaced by the mean value, and testing observations will have their NAs filled in with the mean value regardless. Default is "centralImputNAs".
`min_train`	a minimum number of observations that must be left to train a model. If there are not enough observations, predictions will be `NA`. Default is 2.
`nORp`	a maximum number or fraction of columns/rows with missing values above which a row/column will be removed from `train` before learning the model. Only takes effect if `handleNAs` is set to "centralImputNAs". Default is 0.2.
`.int_parallel`	a Boolean indicating whether rows in the grid search should be tested in parallel
`.intRes`	a Boolean indicating whether the evalRes object output by internal validation should be returned. Defaults to TRUE
`.full_intRes`	a Boolean indicating whether the full results object for internal validation should be returned as well. Defaults to FALSE
`...`	other parameters to feed to `model`
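
The `resample.grid` argument described above can be illustrated with a small base R sketch. It builds a grid and converts each row into a named argument list, dropping NA entries so that the corresponding argument ends up as NULL. The parameter names `C.perc` and `alpha` and the helper `row_to_pars` are hypothetical placeholders chosen for illustration; the package's own handling of the grid may differ.

```r
# Hypothetical re-sampling parameters: `C.perc` and `alpha` are placeholder
# names for illustration only, not documented arguments of the package.
resample.grid <- expand.grid(
  C.perc = c(0.5, 0.8),
  alpha  = c(NA, 0.25),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn grid row `i` into a named list of arguments.
# Entries that are NA are dropped, so looking them up in the list gives NULL.
row_to_pars <- function(grid, i) {
  pars <- as.list(grid[i, , drop = FALSE])
  is_missing <- vapply(pars, is.na, logical(1))
  pars[!is_missing]
}

# One argument list per grid row, ready to be tried during internal validation
pars_list <- lapply(
  seq_len(nrow(resample.grid)),
  function(i) row_to_pars(resample.grid, i)
)
```

Each element of `pars_list` could then be merged with any fixed `resample.pars` before calling a re-sampling function, for example with `do.call`.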

a data frame containing time-stamps, location IDs, true values and predicted values
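
The selection rule described for `metrics`, `metrics.max` and `stat` can be sketched in base R as follows. This is a minimal illustration under assumptions, not the package's implementation: the results table `int_res`, its `grid_row` column, the numbers in it and the helper `pick_best` are all made up for the example.

```r
# Hypothetical internal-validation results: three grid rows, three folds each.
int_res <- data.frame(
  grid_row = rep(1:3, each = 3),
  F1.u     = c(0.60, 0.70, 0.65,  0.70, 0.65, 0.60,  0.50, 0.55, 0.40),
  rmse_phi = c(1.20, 1.10, 1.30,  0.90, 1.00, 0.95,  1.50, 1.40, 1.60)
)

# Hypothetical helper: summarise each metric per grid row with the median
# ("MED") or the mean ("MEAN"), then rank by the first metric and break
# ties with the second one.
pick_best <- function(res, metrics, metrics.max, stat = "MED") {
  summarise <- if (stat == "MED") median else mean
  summ <- aggregate(res[metrics],
                    by = list(grid_row = res$grid_row),
                    FUN = summarise)
  # Negate metrics that should be maximised so ascending order puts the best first
  keys <- lapply(seq_along(metrics), function(j) {
    if (metrics.max[j]) -summ[[metrics[j]]] else summ[[metrics[j]]]
  })
  summ[do.call(order, keys), ][1, "grid_row"]
}

best <- pick_best(int_res,
                  metrics     = c("F1.u", "rmse_phi"),
                  metrics.max = c(TRUE, FALSE),
                  stat        = "MED")
```

In this made-up data, grid rows 1 and 2 share the same median `F1.u`, so the second metric decides between them and the row with the lower median `rmse_phi` is chosen.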

mrfoliveira/STResampling-JDSA2020 documentation built on June 28, 2021, 7:01 p.m.
