qreg_gbm: Multiple Quantile Regression Using Gradient Boosted Decision...

View source: R/MQR_gbm.R

qreg_gbm    R Documentation

Multiple Quantile Regression Using Gradient Boosted Decision Trees

Description

This function fits a gradient boosted quantile regression model for each requested quantile using gbm, with facilities for cross-validation.

Usage

qreg_gbm(
  data,
  formula,
  quantiles = c(0.25, 0.5, 0.75),
  cv_folds = NULL,
  perf.plot = FALSE,
  pred_ntree = NULL,
  cores = 1,
  pckgs = NULL,
  sort = TRUE,
  sort_limits = NULL,
  save_models_path = NULL,
  only_mqr = FALSE,
  exclude_train = NULL,
  ...
)

Arguments

data

A data.frame containing target and explanatory variables.

quantiles

The quantiles to fit models for, given as a numeric vector of probabilities, e.g. the default c(0.25, 0.5, 0.75).

cv_folds

Control for cross-validation with various options, either:

  • the column name of the fold index supplied in data. Observations and inputs in the index labelled "Test" will serve as test data and be held out from model training.

  • an integer giving the number of cross-validation folds to generate. Folds are constructed as block chunks. Default behaviour is 5 folds.

  • a vector of length nrow(data) containing character or numeric fold labels.

  • NULL, which indicates that no cross-validation should be performed and that the returned model is trained on all data.
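The fold options above can be illustrated with a short base R sketch. It assumes that "block chunks" means contiguous runs of rows of roughly equal size; the helper make_block_folds() is hypothetical and is not part of ProbCast, so the package's internal fold construction may differ.

```r
# Hypothetical helper: assign each of n rows to one of k contiguous blocks.
make_block_folds <- function(n, k = 5) {
  # cut() splits the row index 1..n into k equal-width intervals
  as.integer(cut(seq_len(n), breaks = k, labels = FALSE))
}

folds <- make_block_folds(n = 12, k = 3)
print(folds)

# A label vector of length nrow(data): relabel the last block as "Test"
# so that those rows are treated as held-out test data.
fold_labels <- ifelse(folds == 3, "Test", paste0("fold", folds))
print(fold_labels)
```

A vector such as fold_labels could be supplied directly as cv_folds, or stored as a column in data and referred to by its column name.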

perf.plot

Plot GBM performance?

pred_ntree

Predict using a user-specified number of trees. If NULL, gbm::gbm.perf() is used to estimate the best number of trees via out-of-bag estimates, unless internal gbm cross-validation folds are specified in the ... arguments.

cores

The number of available cores. Defaults to one, i.e. no parallelisation, although in this case the user must still specify pckgs if applicable.

pckgs

Additional packages required by each worker (e.g. c("data.table") if data is stored as a data.table).

sort

Sort quantiles using SortQuantiles()?

sort_limits

Limits argument to be passed to SortQuantiles(). Constrains quantiles to upper and lower limits given by list(U=upperlim,L=lowerlim).
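As a rough illustration of what sorting and limiting quantiles achieves, the base R sketch below reorders each row of a small matrix of predicted quantiles so that they do not cross, then clamps them to limits given in the list(U = , L = ) shape described above. The values are made up, and this is not the implementation of SortQuantiles(); in particular, the order of sorting and clamping here is an assumption.

```r
# One row per observation, one column per quantile level (hypothetical values).
q <- rbind(c(0.2, 0.1, 0.6),
           c(-0.1, 0.4, 1.2))
colnames(q) <- c("q25", "q50", "q75")

# Sort within each row so that quantiles are non-decreasing (no crossing).
q_sorted <- t(apply(q, 1, sort))
colnames(q_sorted) <- colnames(q)

# Clamp to upper and lower limits, given in the same shape as sort_limits.
limits <- list(U = 1, L = 0)
q_final <- pmin(pmax(q_sorted, limits$L), limits$U)
print(q_final)
```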

save_models_path

Path to save models. Model details and the file extension are pasted onto this string. Defaults to NULL, i.e. models are not saved.

only_mqr

Return only the out-of-sample predictions?

exclude_train

Control for exclusion of rows in data from model training only, with various options, either:

  • the column name of the binary/boolean exclude flag supplied in data.

  • a vector of binary/boolean exclusion flags of length nrow(data).

  • NULL, which indicates no exclusion.

This option is useful when out-of-sample predictions are required for rows that need to be excluded during model training.
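A minimal sketch of building such a flag in base R is shown below. The data, the column name exclude and the rule (drop rows with a missing target) are all hypothetical, and the sketch assumes that TRUE marks a row to be excluded from training.

```r
# Toy data with one missing target value (hypothetical).
toy <- data.frame(y = c(1.2, NA, 0.7, 3.1, 0.9),
                  x = c(0.1, 0.4, 0.3, 0.9, 0.5))

# Flag rows to leave out of model training: here, rows with a missing target.
toy$exclude <- is.na(toy$y)

# Rows that would remain available for training under this flag.
train_rows <- toy[!toy$exclude, ]
print(train_rows)
```

The flag could then be given either as the column name "exclude" or as the vector toy$exclude.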

...

Additional arguments passed to gbm().

formula

A formula object with the response on the left of an ~ operator, and the terms, separated by + operators, on the right.

Details

The returned predictive quantiles are those produced out-of-sample for each cross-validation fold (using models trained on the remaining folds but not "Test" data). Predictive quantiles corresponding to "Test" data are produced using models trained on all non-test data.
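The train/predict pattern described above can be sketched in base R. To keep the example free of dependencies, the "model" is simply the empirical quantiles of the training target, which stands in for a fitted gbm quantile model; the fold labels and data are hypothetical.

```r
set.seed(42)
y <- rnorm(12)
fold <- c(rep(c("A", "B", "C"), each = 3), rep("Test", 3))
probs <- c(0.25, 0.5, 0.75)

pred <- matrix(NA_real_, nrow = length(y), ncol = length(probs),
               dimnames = list(NULL, paste0("q", probs * 100)))

for (f in unique(fold)) {
  # CV folds: train on the other CV folds, never on "Test" rows.
  # "Test": train on all non-test rows.
  train <- fold != f & fold != "Test"
  # Stand-in model: unconditional empirical quantiles of the training target.
  q_hat <- quantile(y[train], probs)
  # Fill the held-out rows column by column with the predicted quantiles.
  pred[fold == f, ] <- rep(q_hat, each = sum(fold == f))
}
print(pred)
```

Every row receives a prediction from a model that did not see that row during training, which is the sense in which the returned quantiles are out-of-sample.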

The returned models are in a named list corresponding to the model for each fold and can be extracted for further prediction or evaluation. See predict.qreg_gbm().

Value

By default, a named list containing the fitted models as a list of qreg_gbm objects and the out-of-sample cross-validation forecasts as a MultiQR object. The contents of the output list depend on cv_folds.

Alternatively, returns only the out-of-sample cross-validation forecasts as a MultiQR object when only_mqr is TRUE.
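A hedged usage sketch follows. The synthetic data and column names are hypothetical, the argument names are taken from the Usage section, and n.trees is assumed to be passed through ... to gbm(). The call runs only if ProbCast is installed, and the exact structure of the returned object is not asserted here.

```r
# Synthetic data: one response and two explanatory variables (hypothetical).
set.seed(1)
n <- 200
toy <- data.frame(x1 = runif(n), x2 = rnorm(n))
toy$y <- 2 * toy$x1 + 0.5 * toy$x2 + rnorm(n, sd = 0.3)

# Run only if ProbCast is available; arguments follow the Usage section.
if (requireNamespace("ProbCast", quietly = TRUE)) {
  fit <- ProbCast::qreg_gbm(
    data = toy,
    formula = y ~ x1 + x2,
    quantiles = c(0.1, 0.5, 0.9),
    cv_folds = 5,      # integer: number of block folds to generate
    n.trees = 200      # assumed to be forwarded to gbm() via ...
  )
  # Inspect the top level of the returned list.
  str(fit, max.level = 1)
}
```

Setting only_mqr = TRUE in the same call would, per the description above, return just the out-of-sample forecasts.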

Author(s)

Jethro Browell, jethro.browell@strath.ac.uk; Ciaran Gilbert, ciaran.gilbert@strath.ac.uk


jbrowell/ProbCast documentation built on July 20, 2024, 1:53 p.m.