simple_workflow: A simple learning and prediction workflow

View source: R/new_workflows.R

Description

A simple learning and prediction workflow that may deal with NAs and use re-sampling techniques to balance an imbalanced regression problem.

Usage

simple_workflow(
  train,
  test,
  form,
  model,
  time,
  site_id,
  resample.pars = NULL,
  handleNAs = "centralImputNAs",
  min_train = 2,
  nORp = 0.2,
  ...
)

Arguments

train

a data frame for training

test

a data frame for testing

form

a formula describing the model to learn

model

the name of the algorithm to use

time

the name of the column in train and test containing time-stamps

site_id

the name of the column in train and test containing location IDs

resample.pars

parameters to be passed to the re-sampling function. Default is NULL.

handleNAs

string indicating how to deal with NAs. If "centralImputNAs", training observations with at least 80% non-NA columns will have their NAs replaced by the mean value, and testing observations will have their NAs filled in with the mean value regardless. Default is "centralImputNAs".

min_train

the minimum number of observations that must be left to train a model. If there are not enough observations, predictions will be NA. Default is 2.

nORp

a maximum number or fraction of columns/rows with missing values above which a row/column will be removed from train before learning the model. Only applies if handleNAs is set to "centralImputNAs". Default is 0.2.

...

other parameters to feed to model
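
The interaction between handleNAs and nORp described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: impute_sketch is a hypothetical helper, it treats nORp as a fraction only, it handles numeric columns only, and it assumes test NAs are filled with means computed on the retained training rows.

```r
# Hypothetical helper (not part of the package): mean imputation with an
# NA-fraction threshold on training rows. Assumes train and test share the
# same numeric columns and that nORp is a fraction in [0, 1].
impute_sketch <- function(train, test, nORp = 0.2) {
  num_cols <- names(train)[sapply(train, is.numeric)]

  # Fraction of missing columns in each training row
  na_frac <- rowMeans(is.na(train))

  # Drop training rows whose missing fraction exceeds nORp
  train <- train[na_frac <= nORp, , drop = FALSE]

  # Column means computed on the retained training rows (assumption)
  means <- colMeans(train[, num_cols, drop = FALSE], na.rm = TRUE)

  for (col in num_cols) {
    train[[col]][is.na(train[[col]])] <- means[[col]]
    # Test rows are always imputed, regardless of how many NAs they have
    test[[col]][is.na(test[[col]])] <- means[[col]]
  }
  list(train = train, test = test)
}

train <- data.frame(a = c(1, 2, NA, 4), b = c(10, NA, NA, 40),
                    c = c(5, 5, NA, 5), d = c(0, 1, 2, 3))
test  <- data.frame(a = c(NA, 1), b = c(1, NA), c = c(5, 5), d = c(1, 1))

# The third training row has 75% of its columns missing, so it is dropped
out <- impute_sketch(train, test, nORp = 0.5)
```

With a threshold of 0.5, three training rows remain, and the missing b value in the second row is replaced by the mean of the observed b values in the retained rows.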

Value

a data frame containing time-stamps, location IDs, true values and predicted values
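
The shape of the returned value and the role of min_train can be illustrated with a small stand-in in base R. This is a sketch under assumptions and not the package's code: workflow_sketch is a hypothetical function, lm stands in for the model argument, the output column names (time, site_id, trues, preds) are invented for illustration, and "not enough observations" is assumed to mean fewer than min_train training rows.

```r
# Hypothetical stand-in (not the package function): fit on train, predict on
# test, and return time-stamps, location IDs, true and predicted values.
workflow_sketch <- function(train, test, form, time, site_id, min_train = 2) {
  # Response variable named on the left-hand side of the formula
  target <- all.vars(form)[1]

  # Predictions stay NA unless enough training rows are available
  preds <- rep(NA_real_, nrow(test))
  if (nrow(train) >= min_train) {
    fit <- lm(form, data = train)  # lm is an assumed stand-in for `model`
    preds <- as.numeric(predict(fit, newdata = test))
  }

  # Output column names are illustrative assumptions
  data.frame(time = test[[time]], site_id = test[[site_id]],
             trues = test[[target]], preds = preds)
}

train <- data.frame(time = 1:4, site = c("A", "A", "B", "B"),
                    x = c(1, 2, 3, 4), y = c(2, 4, 6, 8))
test  <- data.frame(time = 5:6, site = c("A", "B"),
                    x = c(5, 6), y = c(10, 12))

res <- workflow_sketch(train, test, y ~ x, time = "time", site_id = "site")

# Requiring more training rows than are available leaves predictions as NA
res_na <- workflow_sketch(train, test, y ~ x, time = "time",
                          site_id = "site", min_train = 10)
```

Each row of the result corresponds to one test observation, so the output can be grouped by time-stamp or location when evaluating predictions.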


mrfoliveira/STResampling-JDSA2020 documentation built on June 28, 2021, 7:01 p.m.