View source: R/ensemble_model_spec.R
ensemble_model_spec
Description

A 2-stage stacking regressor that follows:

Stage 1: Sub-models are trained and predicted using modeltime.resample::modeltime_fit_resamples().

Stage 2: A meta-learner (model_spec) is trained on the out-of-sample sub-model predictions using ensemble_model_spec().
Usage

ensemble_model_spec(
    object,
    model_spec,
    kfolds = 5,
    param_info = NULL,
    grid = 6,
    control = control_grid()
)
Arguments

object: A Modeltime Table. Used for the ensemble sub-models.

model_spec: A model_spec object defining the meta-learner stacking model specification. It can be either a non-tuned model spec (all parameters fixed) or a tunable model spec containing parameters marked for tuning with tune::tune(). Both forms are sketched just after this list.

kfolds: K-Fold Cross Validation for tuning the meta-learner. Controls the number of folds used in the meta-learner's cross-validation.

param_info: An optional dials parameters object used to customize parameter ranges for tuning; if NULL, the parameter set is derived automatically.

grid: Grid specification or grid size for tuning the meta-learner.

control: An object used to modify the tuning process. Defaults to tune::control_grid().
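As a quick illustration of the two accepted model_spec forms, here is a sketch using parsnip's linear_reg(); the engines and tuned parameters shown (lm, glmnet, penalty, mixture) are common choices, not requirements:

library(parsnip)
library(tune)

# Non-tuned model spec: every parameter is fixed, so the meta-learner is
# fitted directly to the sub-model predictions (no tuning).
meta_spec_fixed <- linear_reg() %>%
    set_engine("lm")

# Tunable model spec: penalty and mixture are marked with tune(), so the
# meta-learner is grid-searched via K-Fold Cross Validation.
meta_spec_tunable <- linear_reg(
    penalty = tune(),
    mixture = tune()
) %>%
    set_engine("glmnet")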
Stacked Ensemble Process

Start with a Modeltime Table to define your sub-models.

Step 1: Use modeltime.resample::modeltime_fit_resamples() to perform the sub-model resampling procedure.

Step 2: Use ensemble_model_spec() to define and train the meta-learner.
What goes on inside the Meta Learner?

The meta-learner ensembling process uses the following basic steps:

1. Make Cross-Validation Predictions. Cross-validation predictions are made for each sub-model with modeltime.resample::modeltime_fit_resamples(). The out-of-sample sub-model predictions contained in .resample_results are used as the input to the meta-learner.

2. Train a Stacked Regressor (Meta-Learner). The sub-model out-of-sample cross-validation predictions are then modeled using a model_spec with options:

Tuning: If the model_spec includes tuning parameters via tune::tune(), the meta-learner is hyperparameter-tuned using K-Fold Cross Validation. The parameters and grid can be adjusted using kfolds, grid, and param_info.

No-Tuning: If the model_spec does not include tuning parameters via tune::tune(), the meta-learner is not hyperparameter-tuned; the model is simply fitted to the sub-model predictions.

3. Final Model Selection. If tuned, the final model is selected based on RMSE, then retrained on the full set of out-of-sample predictions. If not tuned, the fitted model from Step 2 is used.
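To make the stacking step concrete, here is a minimal conceptual sketch of the idea in base R (not the package internals); the data values and column names are hypothetical:

# Hypothetical out-of-sample predictions from two sub-models. In practice
# these come from the .resample_results produced by
# modeltime.resample::modeltime_fit_resamples().
oos_predictions <- data.frame(
    actual     = c(100, 110, 120, 115),
    submodel_1 = c(98, 112, 118, 117),
    submodel_2 = c(103, 108, 123, 113)
)

# The meta-learner learns how to combine the sub-models. A plain linear
# regression stands in here for the user-supplied model_spec.
meta_learner <- lm(actual ~ submodel_1 + submodel_2, data = oos_predictions)
coef(meta_learner)  # implied sub-model weights (plus an intercept)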
Progress

The best way to follow the training process and watch progress is to use control = control_grid(verbose = TRUE).
Parallelize

Portions of the process can be parallelized. To parallelize, set up parallelization using tune via one of the backends such as doFuture. Then set control = control_grid(allow_par = TRUE).
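As a sketch of one possible setup (the worker count here is an arbitrary choice):

library(doFuture)

# Register the doFuture parallel adapter and pick a plan.
registerDoFuture()
future::plan(future::multisession, workers = 2)

# Then pass control = control_grid(allow_par = TRUE) to ensemble_model_spec().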
Value

A mdl_time_ensemble object.
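The returned ensemble behaves like any other Modeltime model. As a sketch (assuming the ensemble_fit_lm object created in the Examples below), forecasting could look like:

# Add the ensemble to a Modeltime Table, calibrate on the test split,
# and forecast. Assumes ensemble_fit_lm from the Examples section.
modeltime_table(ensemble_fit_lm) %>%
    modeltime_calibrate(new_data = testing(m750_splits)) %>%
    modeltime_forecast(
        new_data    = testing(m750_splits),
        actual_data = m750
    )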
Examples

library(tidymodels)
library(modeltime)
library(modeltime.ensemble)
library(dplyr)
library(timetk)
library(glmnet)

# Step 1: Make resample predictions for submodels
resamples_tscv <- training(m750_splits) %>%
    time_series_cv(
        assess      = "2 years",
        initial     = "5 years",
        skip        = "2 years",
        slice_limit = 1
    )

submodel_predictions <- m750_models %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = TRUE)
    )

# Step 2: Metalearner ----

# * No Metalearner Tuning
ensemble_fit_lm <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg() %>% set_engine("lm"),
        control    = control_grid(verbose = TRUE)
    )

ensemble_fit_lm

# * With Metalearner Tuning ----
ensemble_fit_glmnet <- submodel_predictions %>%
    ensemble_model_spec(
        model_spec = linear_reg(
            penalty = tune(),
            mixture = tune()
        ) %>%
            set_engine("glmnet"),
        grid    = 2,
        control = control_grid(verbose = TRUE)
    )

ensemble_fit_glmnet