generate_fc_automl_h2o: Automated Machine Learning

Description

Function to apply the h2o.automl function from the h2o package on time series data.

Usage

generate_fc_automl_h2o(
  ts_data,
  xreg_data = NULL,
  fc_horizon = 12,
  backtesting_opt = NULL,
  data_dir = NULL,
  prepro_fct = NULL,
  data_transf_method = "diff",
  automl_h2o_arg = NULL,
  time_id = base::Sys.time(),
  nb_threads = 1,
  ...
)

Arguments

ts_data

A univariate 'ts' or 'xts' object

xreg_data

A univariate or multivariate 'ts', 'mts' or 'xts' object, optional external regressors

fc_horizon

An integer, the forecasting horizon (i.e. the number of periods to forecast)

backtesting_opt

A list, options which define the backtesting approach:

use_bt - A boolean, to determine whether forecasts should be generated on future dates (default) or on past values. Generating forecasts on past dates makes it possible to measure past forecast accuracy and to monitor a statistical model's ability to learn signals from the data.

nb_iters - An integer, to determine the number of forecasting operations to apply (when no backtesting is selected, only one forecasting exercise is performed)

method - A string, to determine whether to apply a 'rolling' (default) or a 'moving' forecasting window. When 'rolling' is selected, after each forecasting exercise the forecasting interval advances by one period, and the period it leaves behind is added to the new training sample. When 'moving' is selected, the forecasting interval advances by its own length rather than by one period.

sample_size - A string, to determine whether the training set size should be 'expanding' (default) or 'fixed'. When 'expanding' is selected, the periods dropped from the forecasting interval after each forecasting operation are added to the training set. When 'fixed' is selected, each period added to the training set is offset by dropping one period from it, so that the set's size stays constant.
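
The interaction between method and sample_size can be illustrated with a small base R sketch. The helper bt_windows below is hypothetical and not part of tsForecastR; it assumes the last forecast window ends at the last observation and that a 'fixed' training set drops its oldest periods. The package may align its windows differently.

```r
# Hypothetical helper (not part of tsForecastR): returns the training and
# forecast index ranges for each backtesting iteration.
bt_windows <- function(n_obs, fc_horizon, nb_iters,
                       method = "rolling", sample_size = "expanding") {
  # 'rolling' shifts the window by one period, 'moving' by its own length
  step <- if (method == "rolling") 1L else as.integer(fc_horizon)
  # Assumption: the last forecast window ends at the last observation
  last_train_end <- n_obs - fc_horizon
  train_ends <- last_train_end - step * (rev(seq_len(nb_iters)) - 1L)
  first_len <- train_ends[1]
  lapply(train_ends, function(end) {
    # 'expanding' always starts at the first observation;
    # 'fixed' keeps the length of the first training set
    start <- if (sample_size == "expanding") 1L else end - first_len + 1L
    list(train = c(start, end), forecast = c(end + 1L, end + fc_horizon))
  })
}

# Three rolling iterations with an expanding training set on 144 observations
str(bt_windows(144, fc_horizon = 6, nb_iters = 3))

# Same series with a moving window and a fixed-size training set
str(bt_windows(144, fc_horizon = 6, nb_iters = 3,
               method = "moving", sample_size = "fixed"))
```

With 'rolling', consecutive forecast windows overlap in all but one period; with 'moving', they do not overlap at all.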

data_dir

A string, directory to which results can be saved as text files

prepro_fct

A function, a preprocessing function which handles missing values in the data. The default preprocessing function selects the largest interval of non-missing values and then attributes the most recent dates to those values. Other data handling functions can be applied (e.g. timeSeries::na.contiguous, imputeTS::na.mean, or a custom function).
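
As a rough sketch of a custom preprocessing function, the example below wraps stats::na.contiguous from base R, which keeps the longest run of consecutive non-missing values. The name keep_longest_run is hypothetical, and the sketch assumes that prepro_fct receives a time series and returns a time series; the exact signature the package expects is not stated here.

```r
# Hypothetical custom preprocessing function: keep the longest stretch of
# consecutive non-missing values, using stats::na.contiguous from base R.
keep_longest_run <- function(ts_data) {
  stats::na.contiguous(ts_data)
}

# Monthly series with missing values at the edges and in the middle
x <- stats::ts(c(NA, 1, 2, NA, 3, 4, 5, NA), start = c(2020, 1), frequency = 12)
clean <- keep_longest_run(x)
print(clean)

# Not run (requires tsForecastR and h2o); assumes prepro_fct accepts
# a function of this shape:
# fc <- generate_fc_automl_h2o(x, prepro_fct = keep_longest_run)
```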

data_transf_method

A string, the data transformation method to be passed to the function (available options: 'diff', 'log', 'sqrt')
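
To make the three options concrete, the base R sketch below applies each transformation to a short numeric vector and inverts it again. It only illustrates the transformations themselves; how the package applies and reverses them internally is not documented here, so the inversion steps are an assumption.

```r
# Illustrative values only
x <- c(112, 118, 132, 129, 121)

# 'diff': first differences; inverted by cumulating from the first value
x_diff <- diff(x)
x_from_diff <- cumsum(c(x[1], x_diff))

# 'log': natural logarithm; inverted with exp()
x_log <- log(x)
x_from_log <- exp(x_log)

# 'sqrt': square root; inverted by squaring
x_sqrt <- sqrt(x)
x_from_sqrt <- x_sqrt^2

print(x_diff)
```

Note that 'log' and 'sqrt' need strictly positive and non-negative data respectively, while 'diff' shortens the series by one observation.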

automl_h2o_arg

A list, optional arguments to pass to the h2o.automl function
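
A minimal sketch of such a list is shown below. The entries max_runtime_secs, max_models and seed are arguments of h2o.automl; the values are arbitrary, and the sketch assumes that list names are matched to h2o.automl argument names.

```r
# Arbitrary illustrative settings for h2o.automl
automl_args <- list(
  max_runtime_secs = 60,  # stop the AutoML search after about one minute
  max_models = 5,         # train at most five models
  seed = 1                # seed for reproducibility
)
str(automl_args)

# Not run (requires tsForecastR and a working h2o installation):
# fc <- generate_fc_automl_h2o(AirPassengers,
#                              fc_horizon = 12,
#                              automl_h2o_arg = automl_args)
```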

time_id

A POSIXct, timestamp created with Sys.time which is then appended to the results

nb_threads

A numeric, the number of threads to use in the parallel model selection process

...

Additional arguments to be passed to the function

Value

A 'tsForecastR' object

Examples

## Not run: 
library(datasets)

# Generate forecasts on future dates
fc <- generate_fc_automl_h2o(AirPassengers,
                             fc_horizon = 12)

# Generate forecasts on past dates to analyze performance
fc <- generate_fc_automl_h2o(AirPassengers,
                             fc_horizon = 12,
                             backtesting_opt = list(use_bt = TRUE))

# Generate forecasts on past dates with multiple iterations and a rolling window
fc <- generate_fc_automl_h2o(AirPassengers,
                             fc_horizon = 6,
                             backtesting_opt = list(use_bt = TRUE,
                                                    nb_iters = 6))

## End(Not run)

xavierkamp/tsForecastR documentation built on Feb. 1, 2020, 10:16 a.m.