qreg_mboost: Multiple Quantile Regression using 'mboost'

View source: R/MQR_qreg_mboost.R

qreg_mboost R Documentation

Multiple Quantile Regression using mboost

Description

This function fits multiple quantile regression models using mboost, with facilities for cross-validation. mboost accommodates generalised additive models, decision trees and other learners. See ?mboost for more details.

Usage

qreg_mboost(
  data,
  formula,
  quantiles = c(0.25, 0.5, 0.75),
  cv_folds = NULL,
  w = rep(1, nrow(data)),
  cores = 1,
  pckgs = NULL,
  sort = T,
  sort_limits = NULL,
  save_models_path = NULL,
  only_mqr = FALSE,
  exclude_train = NULL,
  ...
)

Arguments

data

A data.frame containing target and explanatory variables.

quantiles

The quantiles to fit models for.

cv_folds

Control for cross-validation with various options, either:

  • the column name of the fold index supplied in data. Observations labelled "Test" in this index serve as test data and are held out of model training.

  • an integer giving the number of cross validation folds to generate. Folds are constructed as block chunks. Default behaviour is 5 folds.

  • a vector of length nrow(data) containing character or numeric fold labels.

  • NULL indicates that no cross validation should be performed and the returned model is trained on all data.
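As a rough sketch of the column-name and label-vector forms of cv_folds, the snippet below builds both on toy data. The column names x, y and kfold, and the use of the mboost base-learner bbs() in the formula, are assumptions made for illustration. The call itself is guarded because it only runs if ProbCast and mboost are installed.

```r
# Toy data; the column names x, y and kfold are illustrative assumptions.
set.seed(1)
n <- 100
toy <- data.frame(x = runif(n))
toy$y <- sin(2 * pi * toy$x) + rnorm(n, sd = 0.2)

# Column-name form: labels stored in the data; the last 20 rows are "Test".
toy$kfold <- c(rep(c("fold1", "fold2"), each = 40), rep("Test", 20))

# Vector form: an external label vector with one entry per row of the data.
fold_vec <- rep(1:4, each = n / 4)

# Guarded call: ProbCast and mboost may not be installed.
if (requireNamespace("ProbCast", quietly = TRUE) &&
    requireNamespace("mboost", quietly = TRUE)) {
  library(mboost)  # attach so that bbs() can be found when the formula is evaluated
  fit <- ProbCast::qreg_mboost(data = toy, formula = y ~ bbs(x),
                               quantiles = c(0.25, 0.5, 0.75),
                               cv_folds = "kfold")
  # Inspect the returned list rather than assuming its element names.
  str(fit, max.level = 1)
}
```

Passing cv_folds = fold_vec instead of "kfold" would use the external labels; in that case no rows are labelled "Test".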

w

an optional numeric vector of weights to be used in the fitting process.

cores

the number of available cores. Defaults to one, i.e. no parallelisation, although in this case the user must still specify pckgs if applicable.

pckgs

specify additional packages required for each worker (e.g. c("data.table") if data stored as such).

sort

Sort quantiles using SortQuantiles()?

sort_limits

Limits argument to be passed to SortQuantiles(). Constrains quantiles to upper and lower limits given by list(U=upperlim,L=lowerlim).
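To illustrate what sorting and limiting are meant to achieve, the base R sketch below reorders crossing quantiles within each row and clamps them to list(U = 1, L = 0). This is not the SortQuantiles() implementation, only an approximation of the idea; the matrix values and column names are made up.

```r
# Hypothetical predictions: row 2 has crossing quantiles, row 3 leaves [0, 1].
q_raw <- matrix(c( 0.10, 0.30, 0.50,
                   0.40, 0.35, 0.60,
                  -0.05, 0.20, 1.10),
                ncol = 3, byrow = TRUE,
                dimnames = list(NULL, c("q25", "q50", "q75")))

# Limits in the same list(U = ..., L = ...) form as sort_limits.
limits <- list(U = 1, L = 0)

# Sort each row so the quantiles are non-decreasing across columns.
q_sorted <- t(apply(q_raw, 1, sort))
colnames(q_sorted) <- colnames(q_raw)

# Clamp to the lower and upper limits.
q_fixed <- pmin(pmax(q_sorted, limits$L), limits$U)
print(q_fixed)
```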

save_models_path

Path to save models. Model details and file extension pasted onto this string. Defaults to NULL, i.e. no model save.

only_mqr

return only the out-of-sample predictions?

exclude_train

controls which rows of data are excluded from model training only, either:

  • the column name of the binary/boolean exclude flag supplied in data.

  • a vector of binary/boolean exclusion flags of length nrow(data)

  • NULL indicates no exclusion

This option is useful when out-of-sample predictions are required for rows that need to be excluded during model training.
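A minimal sketch of the column-name form follows, assuming a hypothetical logical column named exclude that flags two implausible measurements. The flagged rows should still receive predictions but would not be used for fitting. The call is guarded because ProbCast and mboost may not be installed.

```r
# Toy data; the column names x, y and exclude are illustrative assumptions.
set.seed(2)
toy <- data.frame(x = runif(60))
toy$y <- toy$x + rnorm(60, sd = 0.1)

# Pretend rows 5 and 17 are faulty measurements.
toy$y[c(5, 17)] <- 10

# Logical exclusion flag, one entry per row.
toy$exclude <- toy$y > 5

if (requireNamespace("ProbCast", quietly = TRUE) &&
    requireNamespace("mboost", quietly = TRUE)) {
  library(mboost)  # attach so that bbs() can be found when the formula is evaluated
  fit <- ProbCast::qreg_mboost(data = toy, formula = y ~ bbs(x),
                               cv_folds = 3,
                               exclude_train = "exclude")
  str(fit, max.level = 1)
}
```

Supplying exclude_train = toy$exclude would be the vector form of the same flag.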

...

extra hyper-parameters to be passed to mboost(), e.g. use control = mboost::boost_control() to specify the number of boosting steps and the shrinkage.
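As a hedged example of passing hyper-parameters through ..., the sketch below builds a control object with mboost::boost_control(), where mstop is the number of boosting iterations and nu is the shrinkage (step length). The values 200 and 0.05 and the toy data are arbitrary choices for illustration, not recommended settings.

```r
# Toy data; the column names x and y are illustrative assumptions.
set.seed(3)
toy <- data.frame(x = runif(80))
toy$y <- 2 * toy$x + rnorm(80, sd = 0.3)

if (requireNamespace("mboost", quietly = TRUE)) {
  # 200 boosting iterations with a shrinkage of 0.05 (arbitrary values).
  ctrl <- mboost::boost_control(mstop = 200, nu = 0.05)

  if (requireNamespace("ProbCast", quietly = TRUE)) {
    library(mboost)  # attach so that bbs() can be found when the formula is evaluated
    fit <- ProbCast::qreg_mboost(data = toy, formula = y ~ bbs(x),
                                 quantiles = c(0.1, 0.5, 0.9),
                                 control = ctrl)  # forwarded via ... to mboost()
    str(fit, max.level = 1)
  }
}
```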

formula

A formula object with the response on the left of an ~ operator, and the terms, separated by + operators, on the right.

Details

The returned predictive quantiles are those produced out-of-sample for each cross-validation fold (using models trained on the remaining folds but not "Test" data). Predictive quantiles corresponding to "Test" data are produced using models trained on all non-test data.

The returned models are in a named list corresponding to the model for each fold and can be extracted for further prediction or evaluation. See predict.qreg_mboost().
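The train/predict split described above can be sketched in base R. This is an illustration of the scheme as documented, not the package's internal code, and the fold labels are made up. For each fold, the training rows are those in neither that fold nor "Test"; for "Test", the training rows are all non-test rows.

```r
# Hypothetical fold labels: three cross-validation folds plus a test block.
folds <- c(rep(c("fold1", "fold2", "fold3"), each = 4), rep("Test", 3))

# Rows used to train the model that predicts each fold.
train_rows <- lapply(setNames(nm = unique(folds)), function(f) {
  which(folds != f & folds != "Test")
})

# Rows that each model predicts out-of-sample.
predict_rows <- lapply(setNames(nm = unique(folds)), function(f) {
  which(folds == f)
})

print(train_rows$fold1)  # rows in fold2 and fold3 only
print(train_rows$Test)   # all non-test rows
```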

Value

By default, a named list containing the fitted models as a list of qreg_mboost objects and the out-of-sample cross-validation forecasts as a MultiQR object. The output list depends on cv_folds.

Alternatively, returns only the out-of-sample cross-validation forecasts as a MultiQR object when only_mqr is TRUE.

Author(s)

Jethro Browell, jethro.browell@strath.ac.uk; Ciaran Gilbert, ciaran.gilbert@strath.ac.uk


jbrowell/ProbCast documentation built on July 20, 2024, 1:53 p.m.