ensemble_model_spec: Creates a Stacked Ensemble Model from a Model Spec

Description Usage Arguments Details Value Examples

View source: R/ensemble_model_spec.R

Description

A 2-stage stacking regressor that follows:

  1. Stage 1: Sub-Model's are Trained & Predicted using modeltime.resample::modeltime_fit_resamples().

  2. Stage 2: A Meta-learner (model_spec) is trained on Out-of-Sample Sub-Model Predictions using ensemble_model_spec().

Usage

1
2
3
4
5
6
7
8
ensemble_model_spec(
  object,
  model_spec,
  kfolds = 5,
  param_info = NULL,
  grid = 6,
  control = control_grid()
)

Arguments

object

A Modeltime Table. Used for ensemble sub-models.

model_spec

A model_spec object defining the meta-learner stacking model specification to be used.

Can be either:

  1. A non-tunable model_spec: Parameters are specified and are not optimized via tuning.

  2. A tunable model_spec: Contains parameters identified for tuning with tune::tune()

kfolds

K-Fold Cross Validation for tuning the Meta-Learner. Controls the number of folds used in the meta-learner's cross-validation. Gets passed to rsample::vfold_cv().

param_info

A dials::parameters() object or NULL. If none is given, a parameters set is derived from other arguments. Passing this argument can be useful when parameter ranges need to be customized.

grid

Grid specification or grid size for tuning the Meta Learner. Gets passed to tune::tune_grid().

control

An object used to modify the tuning process. Uses tune::control_grid() by default. Use control_grid(verbose = TRUE) to follow the training process.

Details

Stacked Ensemble Process

What goes on inside the Meta Learner?

The Meta-Learner Ensembling Process uses the following basic steps:

  1. Make Cross-Validation Predictions. Cross validation predictions are made for each sub-model with modeltime_fit_resamples(). The out-of-sample sub-model predictions contained in .resample_results are used as the input to the meta-learner.

  2. Train a Stacked Regressor (Meta-Learner). The sub-model out-of-sample cross validation predictions are then modeled using a model_spec with options:

    • Tuning: If the model_spec does include tuning parameters via tune::tune() then the meta-learner will be hypeparameter tuned using K-Fold Cross Validation. The parameters and grid can adjusted using kfolds, grid, and param_info.

    • No-Tuning: If the model_spec does not include tuning parameters via tune::tune() then the meta-learner will not be hypeparameter tuned and will have the model fitted to the sub-model predictions.

  3. Final Model Selection.

    • If tuned, the final model is selected based on RMSE, then retrained on the full set of out of sample predictions.

    • If not-tuned, the fitted model from Stage 2 is used.

Progress

The best way to follow the training process and watch progress is to use control = control_grid(verbose = TRUE) to see progress.

Parallelize

Portions of the process can be parallelized. To parallelize, set up parallelization using tune via one of the backends such as doFuture. Then set control = control_grid(allow_par = TRUE)

Value

A mdl_time_ensemble object.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
library(tidymodels)
library(modeltime)
library(modeltime.ensemble)
library(tidyverse)
library(timetk)

# Step 1: Make resample predictions for submodels
resamples_tscv <- training(m750_splits) %>%
    time_series_cv(
        assess  = "2 years",
        initial = "5 years",
        skip    = "2 years",
        slice_limit = 1
    )

submodel_predictions <- m750_models %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = TRUE)
    )

# Step 2: Metalearner ----

# * No Metalearner Tuning
ensemble_fit_lm <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg() %>% set_engine("lm"),
        control    = control_grid(verbose = TRUE)
    )

ensemble_fit_lm

# * With Metalearner Tuning ----
ensemble_fit_glmnet <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg(
            penalty = tune(),
            mixture = tune()
        ) %>%
            set_engine("glmnet"),
        grid       = 2,
        control    = control_grid(verbose = TRUE)
    )

ensemble_fit_glmnet

modeltime.ensemble documentation built on Oct. 20, 2021, 1:06 a.m.