Nothing
#' SMI model estimation through a greedy search for penalty parameters
#'
#' Performs a greedy search over a given grid of penalty parameter combinations
#' (lambda0, lambda2), and fits SMI model(s) with best (lowest validation set
#' MSE) penalty parameter combination(s). If the optimal combination lies on the
#' edge of the grid, the penalty parameters are adjusted by ±10%, and a second
#' round of grid search is performed. If a grouping variable is used, penalty
#' parameters are tuned separately for each individual model.
#'
#' @param data Training data set on which models will be trained. Must be a data
#' set of class \code{tsibble}.(Make sure there are no additional date or time
#' related variables except for the \code{index} of the \code{tsibble}). If
#' multiple models are fitted, the grouping variable should be the \code{key}
#' of the \code{tsibble}. If a \code{key} is not specified, a dummy key with
#' only one level will be created.
#' @param val.data Validation data set. (The data set on which the penalty
#' parameter selection will be performed.) Must be a data set of class
#' \code{tsibble}. (Once the penalty parameter selection is completed, the
#' best model will be re-fitted for the combined data set \code{data +
#' val.data}.)
#' @param yvar Name of the response variable as a character string.
#' @param neighbour If multiple models are fitted: Number of neighbours of each
#' key (i.e. grouping variable) to be considered in model fitting to handle
#' smoothing over the key. Should be an \code{integer}. If \code{neighbour =
#' x}, \code{x} number of keys before the key of interest and \code{x} number
#' of keys after the key of interest are grouped together for model fitting.
#' The default is \code{neighbour = 0} (i.e. no neighbours are considered for
#' model fitting).
#' @param family A description of the error distribution and link function to be
#' used in the model (see \code{\link{glm}} and \code{\link{family}}).
#' @param index.vars A \code{character} vector of names of the predictor
#' variables for which indices should be estimated.
#' @param initialise The model structure with which the estimation process
#' should be initialised. The default is \code{"ppr"}, where the initial model
#' is derived from projection pursuit regression. The other options are
#' \code{"additive"} - nonparametric additive model, \code{"linear"} - linear
#' regression model (i.e. a special case single-index model, where the initial
#' values of the index coefficients are obtained through a linear regression),
#' \code{"multiple"} - multiple models are fitted starting with different
#' initial models (number of indices = \code{num_ind}; \code{num_models}
#' random instances of the model (i.e. the predictor assignment to indices and
#' initial index coefficients are generated randomly) are considered), and the
#' final optimal model with the lowest loss is returned, and
#' \code{"userInput"} - user specifies the initial model structure (i.e. the
#' number of indices and the placement of index variables among indices) and
#' the initial index coefficients through \code{index.ind} and
#' \code{index.coefs} arguments respectively.
#' @param num_ind If \code{initialise = "ppr"} or \code{"multiple"}: an
#' \code{integer} that specifies the number of indices to be used in the
#' model(s). The default is \code{num_ind = 5}.
#' @param num_models If \code{initialise = "multiple"}: an \code{integer} that
#' specifies the number of starting models to be checked. The default is
#' \code{num_models = 5}.
#' @param seed If \code{initialise = "multiple"}: the seed to be set when
#' generating random starting points.
#' @param index.ind If \code{initialise = "userInput"}: an \code{integer} vector
#' that assigns group index for each predictor in \code{index.vars}.
#' @param index.coefs If \code{initialise = "userInput"}: a \code{numeric}
#' vector of index coefficients.
#' @param s.vars A \code{character} vector of names of the predictor variables
#' for which splines should be fitted individually (rather than considering as
#' part of an index).
#' @param linear.vars A \code{character} vector of names of the predictor
#' variables that should be included linearly into the model.
#' @param nlambda The number of values for lambda0 (penalty parameter for L0
#' penalty) - default is 100.
#' @param lambda.min.ratio Smallest value for lambda0, as a fraction of
#' lambda0.max (data derived).
#' @param refit Whether to refit the model combining training and validation
#' sets after parameter tuning. If \code{FALSE}, the final model will be
#' estimated only on the training set.
#' @param M Big-M value used in MIP.
#' @param max.iter Maximum number of MIP iterations performed to update index
#' coefficients for a given model.
#' @param tol Tolerance for the objective function value (loss) of MIP.
#' @param tolCoefs Tolerance for coefficients.
#' @param TimeLimit A limit for the total time (in seconds) expended in a single
#' MIP iteration.
#' @param MIPGap Relative MIP optimality gap.
#' @param NonConvex The strategy for handling non-convex quadratic objectives or
#' non-convex quadratic constraints in Gurobi solver.
#' @param verbose A named list controlling verbosity options. Defaults to
#' \code{list(solver = FALSE, progress = FALSE)}.
#' \describe{
#' \item{solver}{Logical. If TRUE, print detailed solver output.}
#' \item{progress}{Logical. If TRUE, print optimisation algorithm progress
#' messages.}
#' }
#' @param parallel The option to use parallel processing in fitting SMI models
#' for different penalty parameter combinations.
#' @param workers If \code{parallel = TRUE}: Number of cores to use.
#' @param exclude.trunc The names of the predictor variables that should not be
#' truncated for stable predictions as a character string. (Since the
#' nonlinear functions are estimated using splines, extrapolation is not
#' desirable. Hence, if any predictor variable in `val.data` that is treated
#' non-linearly in the estimated model, will be truncated to be in the
#' in-sample range before obtaining predictions. If any variables are listed
#' here will be excluded from such truncation.)
#' @param recursive Whether to obtain recursive forecasts or not (default -
#' \code{FALSE}).
#' @param recursive_colRange If \code{recursive = TRUE}, the range of column
#' numbers in \code{val.data} to be filled with forecasts.
#' Recursive/autoregressive forecasting is required when the lags of the
#' response variable itself are used as predictor variables into the model.
#' Make sure such lagged variables are positioned together in increasing lag
#' order (i.e. \code{lag_1, lag_2, ..., lag_m}, \code{lag_m =} maximum lag
#' used) in \code{val.data}, with no break in the lagged variable sequence
#' even if some of the intermediate lags are not used as predictors.
#' @return An object of class \code{smimodel}. This is a \code{tibble} with two
#' columns: \item{key}{The level of the grouping variable (i.e. key of the
#' training data set).} \item{fit}{Information of the fitted model
#' corresponding to the \code{key}.}
#' Each row of the column \code{fit} contains a list with six elements:
#' \item{initial}{A list of information of the model initialisation. (For
#' descriptions of the list elements see \code{\link{make_smimodelFit}}).}
#' \item{best}{A list of information of the final optimised model. (For
#' descriptions of the list elements see \code{\link{make_smimodelFit}}).}
#' \item{best_lambdas}{Selected penalty parameter combination.}
#' \item{lambda0_seq}{Sequence of values for lambda0 used to construct the
#' initial grid.} \item{lambda2_seq}{Sequence of
#' values for lambda2 used to construct the initial grid.}
#' \item{searched}{A \code{tibble} containing the penalty parameter
#' combinations searched during the two-step greedy search and the
#' corresponding validation set MSEs.} The number of
#' rows of the \code{tibble} equals to the number of levels in the grouping
#' variable.
#'
#' @seealso \code{\link{model_smimodel}}
#'
#' @examples
#' \donttest{
#' if(requireNamespace("gurobi", quietly = TRUE)){
#' library(dplyr)
#' library(ROI)
#' library(tibble)
#' library(tidyr)
#' library(tsibble)
#'
#' # Simulate data
#' n = 1205
#' set.seed(123)
#' sim_data <- tibble(x_lag_000 = runif(n)) |>
#' mutate(
#' # Add x_lags
#' x_lag = lag_matrix(x_lag_000, 5)) |>
#' unpack(x_lag, names_sep = "_") |>
#' mutate(
#' # Response variable
#' y = (0.9*x_lag_000 + 0.6*x_lag_001 + 0.45*x_lag_003)^3 + rnorm(n, sd = 0.1),
#' # Add an index to the data set
#' inddd = seq(1, n)) |>
#' drop_na() |>
#' select(inddd, y, starts_with("x_lag")) |>
#' # Make the data set a `tsibble`
#' as_tsibble(index = inddd)
#'
#' # Training set
#' sim_train <- sim_data[1:1000, ]
#' # Validation set
#' sim_val <- sim_data[1001:1200, ]
#'
#' # Index variables
#' index.vars <- colnames(sim_data)[3:8]
#'
#' # Model fitting
#' smi_greedy <- greedy_smimodel(data = sim_train,
#' val.data = sim_val,
#' yvar = "y",
#' index.vars = index.vars,
#' initialise = "ppr",
#' lambda.min.ratio = 0.1)
#'
#' # Best (optimised) fitted model
#' smi_greedy$fit[[1]]$best
#'
#' # Selected penalty parameter combination
#' smi_greedy$fit[[1]]$best_lambdas
#' }
#' }
#'
#' @export
greedy_smimodel <- function(data, val.data, yvar, neighbour = 0,
family = gaussian(), index.vars,
initialise = c("ppr", "additive", "linear",
"multiple", "userInput"),
num_ind = 5, num_models = 5, seed = 123,
index.ind = NULL, index.coefs = NULL,
s.vars = NULL, linear.vars = NULL,
nlambda = 100, lambda.min.ratio = 0.0001,
refit = TRUE, M = 10, max.iter = 50,
tol = 0.001, tolCoefs = 0.001,
TimeLimit = Inf, MIPGap = 1e-4, NonConvex = -1,
verbose = list(solver = FALSE, progress = FALSE),
parallel = FALSE, workers = NULL,
exclude.trunc = NULL,
recursive = FALSE, recursive_colRange = NULL){
# Message for gurobi installation
message("Do you have Gurobi solver installed?
Make sure you have an active installation of Gurobi solver (https://www.gurobi.com/)
in your local machine before using this function.
Refer the section 'Other Required Software' in the README for installation help.")
verbose_default <- list(solver = FALSE, progress = FALSE)
verbose <- modifyList(verbose_default, verbose)
# Check for `tsibble`
stopifnot(tsibble::is_tsibble(data))
stopifnot(tsibble::is_tsibble(val.data))
initialise <- match.arg(initialise)
# Data Preparation
data1 <- data
data2 <- val.data
data_index <- index(data1)
if (length(key(data1)) == 0) {
data1 <- data1 |>
mutate(dummy_key = rep(1, NROW(data1))) |>
as_tsibble(index = data_index, key = dummy_key)
data2 <- data2 |>
mutate(dummy_key = rep(1, NROW(data2))) |>
as_tsibble(index = data_index, key = dummy_key)
}
key11 <- key(data1)[[1]]
key_unique <- unique(as.character(sort(dplyr::pull((data1[, {{ key11 }}])[, 1]))))
key_num <- seq_along(key_unique)
ref <- data.frame(key_unique, key_num)
data1 <- data1 |>
dplyr::mutate(
num_key = as.numeric(factor(as.character({{ key11 }}), levels = key_unique))
)
data2 <- data2 |>
dplyr::mutate(
num_key = as.numeric(factor(as.character({{ key11 }}), levels = key_unique))
)
smimodels_list <- vector(mode = "list", length = NROW(ref))
for (i in seq_along(ref$key_num)){
if(verbose$progress)
print(paste0('model ', paste0(i)))
df_cat <- data1 |>
dplyr::filter((abs(num_key - ref$key_num[i]) <= neighbour) |
(abs(num_key - ref$key_num[i] + NROW(ref)) <= neighbour) |
(abs(num_key - ref$key_num[i] - NROW(ref)) <= neighbour))
df_cat <- df_cat |>
drop_na()
df_cat_val <- data2 |>
dplyr::filter((abs(num_key - ref$key_num[i]) <= neighbour) |
(abs(num_key - ref$key_num[i] + NROW(ref)) <= neighbour) |
(abs(num_key - ref$key_num[i] - NROW(ref)) <= neighbour))
smimodels_list[[i]] <- greedy.fit(data = df_cat, val.data = df_cat_val,
yvar = yvar,
neighbour = neighbour,
family = family, index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed, index.ind = index.ind,
index.coefs = index.coefs, s.vars = s.vars,
linear.vars = linear.vars,
nlambda = nlambda,
lambda.min.ratio = lambda.min.ratio,
refit = refit,
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose,
parallel = parallel, workers = workers,
exclude.trunc = exclude.trunc,
recursive = recursive,
recursive_colRange = recursive_colRange)
}
data_list <- list(key_unique, smimodels_list)
models <- tibble::as_tibble(
x = data_list, .rows = length(data_list[[1]]),
.name_repair = ~ make.names(names = c("key", "fit"))
)
class(models) <- c("smimodel", "tbl_df", "tbl", "data.frame")
return(models)
}
utils::globalVariables(c("dummy_key", "num_key"))
#' Greedy search for tuning penalty parameters
#'
#' Function to perform a greedy search over a given grid of penalty parameter
#' combinations (lambda0, lambda2), and fits a single SMI model with the best
#' (lowest validation set MSE) penalty parameter combination. If the optimal
#' combination lies on the edge of the grid, the penalty parameters are adjusted
#' by ±10%, and a second round of grid search is performed.This is a helper
#' function designed to be called from \code{\link{greedy_smimodel}}.
#'
#' @param data Training data set on which models will be trained. Must be a data
#' set of class \code{tsibble}.(Make sure there are no additional date or time
#' related variables except for the \code{index} of the \code{tsibble}).
#' @param val.data Validation data set. (The data set on which the penalty
#' parameter selection will be performed.) Must be a data set of class
#' \code{tsibble}. (Once the penalty parameter selection is completed, the
#' best model will be re-fitted for the combined data set \code{data +
#' val.data}.)
#' @param yvar Name of the response variable as a character string.
#' @param neighbour `neighbour` argument passed from the outer function.
#' @param family A description of the error distribution and link function to be
#' used in the model (see \code{\link{glm}} and \code{\link{family}}).
#' @param index.vars A \code{character} vector of names of the predictor
#' variables for which indices should be estimated.
#' @param initialise The model structure with which the estimation process
#' should be initialised. The default is \code{"ppr"}, where the initial model
#' is derived from projection pursuit regression. The other options are
#' \code{"additive"} - nonparametric additive model, \code{"linear"} - linear
#' regression model (i.e. a special case single-index model, where the initial
#' values of the index coefficients are obtained through a linear regression),
#' \code{"multiple"} - multiple models are fitted starting with different
#' initial models (number of indices = \code{num_ind}; \code{num_models}
#' random instances of the model (i.e. the predictor assignment to indices and
#' initial index coefficients are generated randomly) are considered), and the
#' final optimal model with the lowest loss is returned, and
#' \code{"userInput"} - user specifies the initial model structure (i.e. the
#' number of indices and the placement of index variables among indices) and
#' the initial index coefficients through \code{index.ind} and
#' \code{index.coefs} arguments respectively.
#' @param num_ind If \code{initialise = "ppr"} or \code{"multiple"}: an
#' \code{integer} that specifies the number of indices to be used in the
#' model(s). The default is \code{num_ind = 5}.
#' @param num_models If \code{initialise = "multiple"}: an \code{integer} that
#' specifies the number of starting models to be checked. The default is
#' \code{num_models = 5}.
#' @param seed If \code{initialise = "multiple"}: the seed to be set when
#' generating random starting points.
#' @param index.ind If \code{initialise = "userInput"}: an \code{integer} vector
#' that assigns group index for each predictor in \code{index.vars}.
#' @param index.coefs If \code{initialise = "userInput"}: a \code{numeric}
#' vector of index coefficients.
#' @param s.vars A \code{character} vector of names of the predictor variables
#' for which splines should be fitted individually (rather than considering as
#' part of an index).
#' @param linear.vars A \code{character} vector of names of the predictor
#' variables that should be included linearly into the model.
#' @param nlambda The number of values for lambda0 (penalty parameter for L0
#' penalty) - default is 100.
#' @param lambda.min.ratio Smallest value for lambda0, as a fraction of
#' lambda0.max (data derived).
#' @param refit Whether to refit the model combining training and validation
#' sets after parameter tuning. If \code{FALSE}, the final model will be
#' estimated only on the training set.
#' @param M Big-M value used in MIP.
#' @param max.iter Maximum number of MIP iterations performed to update index
#' coefficients for a given model.
#' @param tol Tolerance for the objective function value (loss) of MIP.
#' @param tolCoefs Tolerance for coefficients.
#' @param TimeLimit A limit for the total time (in seconds) expended in a single
#' MIP iteration.
#' @param MIPGap Relative MIP optimality gap.
#' @param NonConvex The strategy for handling non-convex quadratic objectives or
#' non-convex quadratic constraints in Gurobi solver.
#' @param verbose A named list controlling verbosity options. Defaults to
#' \code{list(solver = FALSE, progress = FALSE)}.
#' \describe{
#' \item{solver}{Logical. If TRUE, print detailed solver output.}
#' \item{progress}{Logical. If TRUE, print optimisation algorithm progress
#' messages.}
#' }
#' @param parallel The option to use parallel processing in fitting SMI models
#' for different penalty parameter combinations.
#' @param workers If \code{parallel = TRUE}: Number of cores to use.
#' @param exclude.trunc The names of the predictor variables that should not be
#' truncated for stable predictions as a character string. (Since the
#' nonlinear functions are estimated using splines, extrapolation is not
#' desirable. Hence, if any predictor variable in `val.data` that is treated
#' non-linearly in the estimated model, will be truncated to be in the
#' in-sample range before obtaining predictions. If any variables are listed
#' here will be excluded from such truncation.)
#' @param recursive Whether to obtain recursive forecasts or not (default -
#' \code{FALSE}).
#' @param recursive_colRange If \code{recursive = TRUE}, the range of column
#' numbers in \code{val.data} to be filled with forecasts.
#' Recursive/autoregressive forecasting is required when the lags of the
#' response variable itself are used as predictor variables into the model.
#' Make sure such lagged variables are positioned together in increasing lag
#' order (i.e. \code{lag_1, lag_2, ..., lag_m}, \code{lag_m =} maximum lag
#' used) in \code{val.data}, with no break in the lagged variable sequence
#' even if some of the intermediate lags are not used as predictors.
#' @return A list that contains six elements: \item{initial}{A list of
#' information of the model initialisation. (For descriptions of the list
#' elements see \code{\link{make_smimodelFit}}).} \item{best}{A list of
#' information of the final optimised model. (For descriptions of the list
#' elements see \code{\link{make_smimodelFit}}).} \item{best_lambdas}{Selected
#' penalty parameter combination.} \item{lambda0_seq}{Sequence of values for
#' lambda0 used to construct the initial grid.} \item{lambda2_seq}{Sequence of
#' values for lambda2 used to construct the initial grid.}
#' \item{searched}{A \code{tibble} containing the penalty parameter
#' combinations searched during the two-step greedy search and the
#' corresponding validation set MSEs.}
greedy.fit <- function(data, val.data, yvar, neighbour = 0,
family = gaussian(), index.vars,
initialise = c("ppr", "additive", "linear",
"multiple", "userInput"),
num_ind = 5, num_models = 5, seed = 123, index.ind = NULL,
index.coefs = NULL, s.vars = NULL, linear.vars = NULL,
nlambda = 100, lambda.min.ratio = 0.0001,
refit = TRUE, M = 10, max.iter = 50,
tol = 0.001, tolCoefs = 0.001,
TimeLimit = Inf, MIPGap = 1e-4, NonConvex = -1,
verbose = list(solver = FALSE, progress = FALSE),
parallel = FALSE, workers = NULL,
exclude.trunc = NULL,
recursive = FALSE, recursive_colRange = NULL){
verbose_default <- list(solver = FALSE, progress = FALSE)
verbose <- modifyList(verbose_default, verbose)
## Calculating lambda0.max based on the scale of the first term in the
## SMI modelling objective function
# Fit an additive model (gam) as a benchmark
bench <- model_gam(data = data,
yvar = yvar,
family = family,
s.vars = c(index.vars, s.vars),
linear.vars = linear.vars)
# Residuals
bench_resid <- augment(bench)$.resid
# lambda0_seq
lambda0_max <- sum(bench_resid^2)
lambda0_min <- lambda0_max * lambda.min.ratio
lambda0_seq <- c(0, exp(seq(log(lambda0_min), log(lambda0_max), length.out = nlambda)))
# lambda2_seq - a sequence of power of tens
# lambda2.max is taken as the power of ten that matches the scale of lambda0.max
lambda2_max <- round(log10(abs(lambda0_max)))
lambda2_seq <- c(0, 10^seq(-2, lambda2_max, by = 1))
l0_len <- length(lambda0_seq)
l2_len <- length(lambda2_seq)
# Data frame for storing all combinations searched
all_comb <- data.frame()
# Vector for storing validation set MSEs of all combinations searched
all_comb_mse <- numeric()
# Select map function
if(parallel){
future::plan("future::multisession", workers = workers)
map_f <- furrr::future_map
} else {
map_f <- purrr::map
}
# Starting point options
start_l0 <- c(lambda0_seq[1], lambda0_seq[ceiling(l0_len / 2)], lambda0_seq[l0_len])
start_l2 <- c(lambda2_seq[1], lambda2_seq[ceiling(l2_len / 2)], lambda2_seq[l2_len])
lambda_comb_start <- expand.grid(start_l0, start_l2)
# Model fitting for each possible starting point
MSE_list_start <- seq(1, NROW(lambda_comb_start), by = 1) |>
map_f(~ tune_smimodel(data = data, val.data = val.data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda.comb = as.numeric(lambda_comb_start[., ]),
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose,
exclude.trunc = exclude.trunc,
recursive = recursive,
recursive_colRange = recursive_colRange))
if(verbose$progress)
print("Potential starting points completed.")
# Updating searched combinations store
all_comb <- bind_rows(all_comb, lambda_comb_start)
# Updating searched combinations MSE
all_comb_mse <- c(all_comb_mse, unlist(MSE_list_start))
# Search points around the starting points
min_MSE_list <- vector(mode = "list", length = NROW(lambda_comb_start))
min_lambdas_list <- vector(mode = "list", length = NROW(lambda_comb_start))
for(i in 1:NROW(lambda_comb_start)){
current_MSE <- MSE_list_start[[i]]
current_lambdas <- as.numeric(lambda_comb_start[i, ])
# Constructing new search space
# lambda0
l0_indx <- which(lambda0_seq == current_lambdas[1])
if(l0_indx < l0_len){
lambda0_seq_new <- c(lambda0_seq[l0_indx - 1], current_lambdas[1],
lambda0_seq[l0_indx + 1])
}else{
lambda0_seq_new <- c(lambda0_seq[l0_indx - 1], current_lambdas[1])
}
# lambda2
l2_indx <- which(lambda2_seq == current_lambdas[2])
if(l2_indx < l2_len){
lambda2_seq_new <- c(lambda2_seq[l2_indx - 1], current_lambdas[2],
lambda2_seq[l2_indx + 1])
}else{
lambda2_seq_new <- c(lambda2_seq[l2_indx - 1], current_lambdas[2])
}
lambda_comb_new <- expand.grid(lambda0_seq_new, lambda2_seq_new)
# Remove already searched combinations
lambda_exist <- do.call(paste0, lambda_comb_new) %in% do.call(paste0, all_comb)
lambda_comb <- lambda_comb_new[lambda_exist == FALSE, ]
if(NROW(lambda_comb) == 0){
min_MSE_list[[i]] <- current_MSE
min_lambdas_list[[i]] <- current_lambdas
}else{
MSE_list <- seq(1, NROW(lambda_comb), by = 1) |>
map_f(~ tune_smimodel(data = data, val.data = val.data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda.comb = as.numeric(lambda_comb[., ]),
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose,
exclude.trunc = exclude.trunc,
recursive = recursive,
recursive_colRange = recursive_colRange))
# Store the point that gives minimum validation set MSE
min_lambda_pos <- which.min(unlist(MSE_list))
min_MSE_temp <- min(unlist(MSE_list))
if(min_MSE_temp < current_MSE){
min_MSE_list[[i]] <- min_MSE_temp
min_lambdas_list[[i]] <- as.numeric(lambda_comb[min_lambda_pos, ])
}else{
min_MSE_list[[i]] <- current_MSE
min_lambdas_list[[i]] <- current_lambdas
}
if(verbose$progress)
print(paste("Initial search around potential starting point", i, "completed."))
# Updating searched combinations store
all_comb <- bind_rows(all_comb, lambda_comb)
# Updating searched combinations MSE
all_comb_mse <- c(all_comb_mse, unlist(MSE_list))
}
}
# Best point and corresponding (minimum) MSE up to now
min_lambda_pos <- which.min(unlist(min_MSE_list))
min_MSE <- min(unlist(min_MSE_list))
min_lambdas <- as.numeric(min_lambdas_list[[min_lambda_pos]])
if(verbose$progress)
print("Starting point for the greedy search selected.")
# Reset current minimum MSE
current_MSE <- Inf
# Reset current best lambdas
current_lambdas <- numeric(length = 2)
# Greedy search - round 1
while(min_MSE < current_MSE){
current_MSE <- min_MSE
current_lambdas <- min_lambdas
# Constructing new search space
# lambda0
l0_indx <- which(lambda0_seq == current_lambdas[1])
if(l0_indx < l0_len){
lambda0_seq_new <- c(lambda0_seq[l0_indx - 1], current_lambdas[1],
lambda0_seq[l0_indx + 1])
}else{
lambda0_seq_new <- c(lambda0_seq[l0_indx - 1], current_lambdas[1])
}
# lambda2
l2_indx <- which(lambda2_seq == current_lambdas[2])
if(l2_indx < l2_len){
lambda2_seq_new <- c(lambda2_seq[l2_indx - 1], current_lambdas[2],
lambda2_seq[l2_indx + 1])
}else{
lambda2_seq_new <- c(lambda2_seq[l2_indx - 1], current_lambdas[2])
}
lambda_comb_new <- expand.grid(lambda0_seq_new, lambda2_seq_new)
# Remove already searched combinations
lambda_exist <- do.call(paste0, lambda_comb_new) %in% do.call(paste0, all_comb)
lambda_comb <- lambda_comb_new[lambda_exist == FALSE, ]
if(NROW(lambda_comb) == 0){
break
}else{
MSE_list <- seq(1, NROW(lambda_comb), by = 1) |>
map_f(~ tune_smimodel(data = data, val.data = val.data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda.comb = as.numeric(lambda_comb[., ]),
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose,
exclude.trunc = exclude.trunc,
recursive = recursive,
recursive_colRange = recursive_colRange))
# Selecting best point
min_lambda_pos <- which.min(unlist(MSE_list))
min_MSE <- min(unlist(MSE_list))
min_lambdas <- as.numeric(lambda_comb[min_lambda_pos, ])
if(verbose$progress)
print("An iteration of greedy search - step 1 is completed.")
# Updating searched combinations store
all_comb <- bind_rows(all_comb, lambda_comb)
# Updating searched combinations MSE
all_comb_mse <- c(all_comb_mse, unlist(MSE_list))
}
}
# If the selected point is an edge point, expand grid
if(current_lambdas[1] == lambda0_max | current_lambdas[2] == lambda2_max){
# Current minimum MSE - step 2
current_MSE2 <- Inf
# Current best lambdas - step 2
current_lambdas2 <- numeric(length = 2)
# min MSE and min lambdas
min_MSE <- current_MSE
min_lambdas <- current_lambdas
# Greedy search - step 2
while(min_MSE < current_MSE2){
current_MSE2 <- min_MSE
current_lambdas2 <- min_lambdas
# Constructing new search space
# lambda0
if(current_lambdas2[1] == 0){
lambda0_seq_new <- current_lambdas2[1]
}else{
lambda0_seq_new <- c((current_lambdas2[1] - 0.1*current_lambdas2[1]),
current_lambdas2[1],
(current_lambdas2[1] + 0.1*current_lambdas2[1]))
lambda0_seq_new <- lambda0_seq_new[which(lambda0_seq_new >= 0)]
}
# lambda2
if(current_lambdas2[2] == 0){
lambda2_seq_new <- current_lambdas2[2]
}else{
lambda2_seq_new <- c((current_lambdas2[2] - 0.1*current_lambdas2[2]),
current_lambdas2[2],
(current_lambdas2[2] + 0.1*current_lambdas2[2]))
lambda2_seq_new <- lambda2_seq_new[which(lambda2_seq_new >= 0)]
}
lambda_comb_new <- expand.grid(lambda0_seq_new, lambda2_seq_new)
# Remove already searched combinations
lambda_exist <- do.call(paste0, lambda_comb_new) %in% do.call(paste0, all_comb)
lambda_comb <- lambda_comb_new[lambda_exist == FALSE, ]
if(NROW(lambda_comb) == 0){
break
}else{
MSE_list <- seq(1, NROW(lambda_comb), by = 1) |>
map_f(~ tune_smimodel(data = data, val.data = val.data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda.comb = as.numeric(lambda_comb[., ]),
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose,
exclude.trunc = exclude.trunc,
recursive = recursive,
recursive_colRange = recursive_colRange))
# Selecting best starting point
min_lambda_pos <- which.min(unlist(MSE_list))
min_MSE <- min(unlist(MSE_list))
min_lambdas <- as.numeric(lambda_comb[min_lambda_pos, ])
if(verbose$progress)
print("An iteration of greedy search - step 2 is completed.")
# Updating searched combinations store
all_comb <- bind_rows(all_comb, lambda_comb)
# Updating searched combinations MSE
all_comb_mse <- c(all_comb_mse, unlist(MSE_list))
}
}
current_MSE <- current_MSE2
current_lambdas <- current_lambdas2
}
# tibble with the information on all searched lambda combinations
all_comb_info <- as_tibble(all_comb) |>
rename(lambda0 = Var1,
lambda2 = Var2) |>
mutate(val_MSE = all_comb_mse)
## Compare with no-index and null models
if(!is.null(s.vars) | !is.null(linear.vars)){
# no-index model
noIndex_model <- model_gam(data = data,
yvar = yvar,
family = family,
s.vars = s.vars,
linear.vars = linear.vars)
noIndex_preds <- predict(object = noIndex_model,
newdata = val.data,
exclude.trunc = exclude.trunc,
recursive = recursive,
recursive_colRange = recursive_colRange)
noIndex_val_mse <- MSE(residuals = (as.numeric(as.matrix(noIndex_preds[,{{yvar}}], ncol = 1)) - noIndex_preds$.predict))
# null model
pre.formula <- paste(yvar, "~", 1)
null_model <- lm(formula = as.formula(pre.formula), data = data)
null_preds <- predict(object = null_model, newdata = val.data)
null_val_mse <- MSE(residuals = (as.numeric(as.matrix(val.data[,{{yvar}}], ncol = 1)) - null_preds))
# MSE vector
three_mse <- c(current_MSE, noIndex_val_mse, null_val_mse)
}else if(is.null(s.vars) & is.null(linear.vars)){
# null model
pre.formula <- paste(yvar, "~", 1)
null_model <- lm(formula = as.formula(pre.formula), data = data)
null_preds <- predict(object = null_model, newdata = val.data)
null_val_mse <- MSE(residuals = (as.numeric(as.matrix(val.data[,{{yvar}}], ncol = 1)) - null_preds))
# MSE vector
three_mse <- c(current_MSE, NA, null_val_mse)
}
min_three_mse <- which.min(three_mse)
## Fit final model
if(refit == TRUE){
# Re-fit the best model for the combined data set training + validation
data <- dplyr::bind_rows(data, val.data)
}
if(min_three_mse == 1){
final_smimodel_list <- smimodel.fit(data = data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda0 = current_lambdas[1],
lambda2 = current_lambdas[2],
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose)
if(verbose$progress)
print("Final SMI model is fitted.")
}else if(min_three_mse == 2){
final_smimodel_list <- smimodel.fit(data = data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda0 = 1e+50,
lambda2 = 0,
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose)
message("A null index model is selected as the final model. The final model consists only of s.vars and/or linear.vars.")
}else if(min_three_mse == 3){
final_smimodel_list <- smimodel.fit(data = data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = NULL,
linear.vars = NULL,
lambda0 = 1e+50,
lambda2 = 0,
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose)
message("A null model is selected as the final model. The final model includes only the intercept.")
}
output <- list("initial" = final_smimodel_list$initial,
"best" = final_smimodel_list$best,
"best_lambdas" = current_lambdas,
"lambda0_seq" = lambda0_seq,
"lambda2_seq" = lambda2_seq,
"searched" = all_comb_info)
return(output)
}
utils::globalVariables(c("Var1", "Var2"))
#' SMI model with a given penalty parameter combination
#'
#' Fits a nonparametric multiple index model to the data for a given combination
#' of the penalty parameters (lambda0, lambda2), and returns the validation set
#' mean squared error (MSE). (Used within \code{\link{greedy.fit}}; users are
#' not expected to use this function directly.)
#'
#' @param data Training data set on which models will be trained. Must be a data
#' set of class \code{tsibble}.(Make sure there are no additional date or time
#' related variables except for the \code{index} of the \code{tsibble}).
#' @param val.data Validation data set. (The data set on which the penalty
#' parameter selection will be performed.) Must be a data set of class
#' \code{tsibble}. (Once the penalty parameter selection is completed, the
#' best model will be re-fitted for the combined data set \code{data +
#' val.data}.)
#' @param yvar Name of the response variable as a character string.
#' @param neighbour `neighbour` argument passed from the outer function.
#' @param family A description of the error distribution and link function to be
#' used in the model (see \code{\link{glm}} and \code{\link{family}}).
#' @param index.vars A \code{character} vector of names of the predictor
#' variables for which indices should be estimated.
#' @param initialise The model structure with which the estimation process
#' should be initialised. The default is \code{"ppr"}, where the initial model
#' is derived from projection pursuit regression. The other options are
#' \code{"additive"} - nonparametric additive model, \code{"linear"} - linear
#' regression model (i.e. a special case single-index model, where the initial
#' values of the index coefficients are obtained through a linear regression),
#' \code{"multiple"} - multiple models are fitted starting with different
#' initial models (number of indices = \code{num_ind}; \code{num_models}
#' random instances of the model (i.e. the predictor assignment to indices and
#' initial index coefficients are generated randomly) are considered), and the
#' final optimal model with the lowest loss is returned, and
#' \code{"userInput"} - user specifies the initial model structure (i.e. the
#' number of indices and the placement of index variables among indices) and
#' the initial index coefficients through \code{index.ind} and
#' \code{index.coefs} arguments respectively.
#' @param num_ind If \code{initialise = "ppr"} or \code{"multiple"}: an
#' \code{integer} that specifies the number of indices to be used in the
#' model(s). The default is \code{num_ind = 5}.
#' @param num_models If \code{initialise = "multiple"}: an \code{integer} that
#' specifies the number of starting models to be checked. The default is
#' \code{num_models = 5}.
#' @param seed If \code{initialise = "multiple"}: the seed to be set when
#' generating random starting points.
#' @param index.ind If \code{initialise = "userInput"}: an \code{integer} vector
#' that assigns group index for each predictor in \code{index.vars}.
#' @param index.coefs If \code{initialise = "userInput"}: a \code{numeric}
#' vector of index coefficients.
#' @param s.vars A \code{character} vector of names of the predictor variables
#' for which splines should be fitted individually (rather than considering as
#' part of an index).
#' @param linear.vars A \code{character} vector of names of the predictor
#' variables that should be included linearly into the model.
#' @param lambda.comb A \code{numeric} vector (of length two) indicating the
#' values for the two penalty parameters lambda0 and lambda2.
#' @param M Big-M value used in MIP.
#' @param max.iter Maximum number of MIP iterations performed to update index
#' coefficients for a given model.
#' @param tol Tolerance for the objective function value (loss) of MIP.
#' @param tolCoefs Tolerance for coefficients.
#' @param TimeLimit A limit for the total time (in seconds) expended in a single
#' MIP iteration.
#' @param MIPGap Relative MIP optimality gap.
#' @param NonConvex The strategy for handling non-convex quadratic objectives or
#' non-convex quadratic constraints in Gurobi solver.
#' @param verbose A named list controlling verbosity options. Defaults to
#' \code{list(solver = FALSE, progress = FALSE)}.
#' \describe{
#' \item{solver}{Logical. If TRUE, print detailed solver output.}
#' \item{progress}{Logical. If TRUE, print optimisation algorithm progress
#' messages.}
#' }
#' @param exclude.trunc The names of the predictor variables that should not be
#' truncated for stable predictions as a character string. (Since the
#' nonlinear functions are estimated using splines, extrapolation is not
#' desirable. Hence, if any predictor variable in `val.data` that is treated
#' non-linearly in the estimated model, will be truncated to be in the
#' in-sample range before obtaining predictions. If any variables are listed
#' here will be excluded from such truncation.)
#' @param recursive Whether to obtain recursive forecasts or not (default -
#' \code{FALSE}).
#' @param recursive_colRange If \code{recursive = TRUE}, the range of column
#' numbers in \code{val.data} to be filled with forecasts.
#' Recursive/autoregressive forecasting is required when the lags of the
#' response variable itself are used as predictor variables into the model.
#' Make sure such lagged variables are positioned together in increasing lag
#' order (i.e. \code{lag_1, lag_2, ..., lag_m}, \code{lag_m =} maximum lag
#' used) in \code{val.data}, with no break in the lagged variable sequence
#' even if some of the intermediate lags are not used as predictors.
#' @return A \code{numeric}.
tune_smimodel <- function(data, val.data, yvar, neighbour = 0,
family = gaussian(), index.vars,
initialise = c("ppr", "additive", "linear",
"multiple", "userInput"),
num_ind = 5, num_models = 5, seed = 123,
index.ind = NULL, index.coefs = NULL,
s.vars = NULL, linear.vars = NULL, lambda.comb = c(1, 1),
M = 10, max.iter = 50, tol = 0.001, tolCoefs = 0.001,
TimeLimit = Inf, MIPGap = 1e-4,
NonConvex = -1,
verbose = list(solver = FALSE, progress = FALSE),
exclude.trunc = NULL,
recursive = FALSE, recursive_colRange = NULL){
verbose_default <- list(solver = FALSE, progress = FALSE)
verbose <- modifyList(verbose_default, verbose)
# Estimating smimodel
smimodel <- smimodel.fit(data = data, yvar = yvar,
neighbour = neighbour,
family = family,
index.vars = index.vars,
initialise = initialise,
num_ind = num_ind, num_models = num_models,
seed = seed,
index.ind = index.ind,
index.coefs = index.coefs,
s.vars = s.vars,
linear.vars = linear.vars,
lambda0 = lambda.comb[[1]], lambda2 = lambda.comb[[2]],
M = M, max.iter = max.iter,
tol = tol, tolCoefs = tolCoefs,
TimeLimit = TimeLimit, MIPGap = MIPGap,
NonConvex = NonConvex, verbose = verbose)
# # Predictions on validation set
pred <- predict(object = smimodel$best, newdata = val.data,
exclude.trunc = exclude.trunc, recursive = recursive,
recursive_colRange = recursive_colRange)$.predict
# Validation set MSE
# Convert to a tibble
index_val <- index(val.data)
val.data <- val.data |>
as_tibble() |>
arrange({{index_val}})
smimodel_mse <- MSE(residuals = (as.numeric(as.matrix(val.data[,{{yvar}}], ncol = 1)) - pred))
return(smimodel_mse)
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.