| model_backward | R Documentation |
Fits a nonparametric additive model, with simultaneous variable selection through a backward elimination procedure as proposed by Fan and Hyndman (2012).
model_backward(
data,
val.data,
yvar,
neighbour = 0,
family = gaussian(),
s.vars = NULL,
s.basedim = NULL,
linear.vars = NULL,
refit = TRUE,
tol = 0.001,
parallel = FALSE,
workers = NULL,
exclude.trunc = NULL,
recursive = FALSE,
recursive_colRange = NULL,
verbose = FALSE
)
data |
Training data set on which models will be trained. Must be a data
set of class |
val.data |
Validation data set. (The data set on which the model
selection will be performed.) Must be a data set of class |
yvar |
Name of the response variable as a character string. |
neighbour |
If multiple models are fitted: Number of neighbours of each
key (i.e. grouping variable) to be considered in model fitting to handle
smoothing over the key. Should be an |
family |
A description of the error distribution and link function to be
used in the model (see |
s.vars |
A |
s.basedim |
Dimension of the bases used to represent the smooth terms
corresponding to |
linear.vars |
A |
refit |
Whether to refit the model combining training and validation
sets after model selection. If |
tol |
Tolerance for the ratio of relative change in validation set MSE, used in model selection. |
parallel |
Whether to use parallel computing in model selection or not. |
workers |
If |
exclude.trunc |
The names of the predictor variables that should not be
truncated for stable predictions as a character string. (Since the
nonlinear functions are estimated using splines, extrapolation is not
desirable. Hence, if any predictor variable in |
recursive |
Whether to obtain recursive forecasts or not (default -
|
recursive_colRange |
If |
verbose |
Logical; controls whether progress messages (model indices) are printed during fitting. Defaults to FALSE. |
This function fits a nonparametric additive model with variable selection by backward elimination, as proposed by Fan and Hyndman (2012). The process starts with all predictors included in an additive model; predictors are then progressively omitted until the best model is obtained based on the validation set. Once the best model is obtained, the final model is refitted on the data set combining the training and validation sets (see refit). For more details, see the reference.
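The backward elimination idea described above can be illustrated with a small, self-contained sketch. This is not the package's implementation: the helper name backward_sketch is hypothetical, lm() is swapped in for the additive model purely to avoid extra dependencies, and the stopping rule (stop once the relative improvement in validation MSE no longer exceeds tol) is an assumption about how a tolerance such as tol might be applied.

```r
# Hypothetical helper: backward elimination driven by validation-set MSE.
# lm() is a stand-in for the additive model; the real function fits GAMs.
backward_sketch <- function(train, val, yvar, vars, tol = 0.001) {
  # Fit on the training set, score on the validation set
  val_mse <- function(v) {
    rhs <- if (length(v) == 0) "1" else paste(v, collapse = " + ")
    fit <- lm(as.formula(paste(yvar, "~", rhs)), data = train)
    mean((val[[yvar]] - predict(fit, newdata = val))^2)
  }
  current <- vars
  best_mse <- val_mse(current)
  while (length(current) > 0) {
    # Validation MSE after dropping each remaining predictor in turn
    cand <- vapply(current, function(v) val_mse(setdiff(current, v)), numeric(1))
    new_mse <- min(cand)
    # Assumed stopping rule: stop unless the relative improvement exceeds tol
    if ((best_mse - new_mse) / best_mse <= tol) break
    current <- setdiff(current, names(cand)[which.min(cand)])
    best_mse <- new_mse
  }
  list(selected = current, mse = best_mse)
}

# Toy data: y depends on x1 and x2 only; x3 is pure noise
set.seed(1)
n <- 300
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
d$y <- 2 * d$x1 - d$x2 + rnorm(n, sd = 0.1)
res <- backward_sketch(d[1:200, ], d[201:300, ], "y", c("x1", "x2", "x3"))
res$selected
```

In this toy setting the informative predictors x1 and x2 are always retained, because dropping either one increases the validation MSE; whether x3 is dropped depends on the simulated noise and on tol.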
An object of class backward. This is a tibble with two
columns:
key |
The level of the grouping variable (i.e. key of the training data set). |
fit |
Information of the fitted model
corresponding to the |
Each row of the column fit is an
object of class gam. For details, refer to mgcv::gamObject.
Fan, S. & Hyndman, R.J. (2012). Short-Term Load Forecasting Based on a Semi-Parametric Additive Model. IEEE Transactions on Power Systems, 27(1), 134-141. doi:10.1109/TPWRS.2011.2162082.
model_smimodel, model_gaim,
model_ppr, model_gam, model_lm
# Assumed to be the package that provides model_backward() and lag_matrix()
library(smimodel)
library(dplyr)
library(tibble)
library(tidyr)
library(tsibble)

# Simulate data
n <- 1205
set.seed(123)
sim_data <- tibble(x_lag_000 = runif(n)) |>
  mutate(
    # Add x_lags
    x_lag = lag_matrix(x_lag_000, 5)) |>
  unpack(x_lag, names_sep = "_") |>
  mutate(
    # Response variable
    y = (0.9*x_lag_000 + 0.6*x_lag_001 + 0.45*x_lag_003)^3 + rnorm(n, sd = 0.1),
    # Add an index to the data set
    inddd = seq(1, n)) |>
  # Drop the first 5 rows, whose lagged values are NA
  drop_na() |>
  select(inddd, y, starts_with("x_lag")) |>
  # Make the data set a `tsibble`
  as_tsibble(index = inddd)

# Training set
sim_train <- sim_data[1:1000, ]

# Validation set
sim_val <- sim_data[1001:1200, ]

# Predictors taken as non-linear variables (x_lag_000 to x_lag_005)
s.vars <- colnames(sim_data)[3:8]

# Model fitting
backwardModel <- model_backward(data = sim_train,
                                val.data = sim_val,
                                yvar = "y",
                                s.vars = s.vars)

# Fitted model
backwardModel$fit[[1]]