s_GBM | R Documentation |
Train a GBM model using gbm::gbm.fit
s_GBM(
x,
y = NULL,
x.test = NULL,
y.test = NULL,
weights = NULL,
ifw = TRUE,
ifw.type = 2,
upsample = FALSE,
downsample = FALSE,
resample.seed = NULL,
distribution = NULL,
interaction.depth = 2,
shrinkage = 0.01,
bag.fraction = 0.9,
n.minobsinnode = 5,
n.trees = 2000,
max.trees = 5000,
force.n.trees = NULL,
gbm.select.smooth = FALSE,
n.new.trees = 500,
min.trees = 50,
failsafe.trees = 500,
imetrics = FALSE,
.gs = FALSE,
grid.resample.params = setup.resample("kfold", 5),
gridsearch.type = "exhaustive",
metric = NULL,
maximize = NULL,
plot.tune.error = FALSE,
n.cores = rtCores,
relInf = TRUE,
varImp = FALSE,
offset = NULL,
var.monotone = NULL,
keep.data = TRUE,
var.names = NULL,
response.name = "y",
checkmods = FALSE,
group = NULL,
plot.perf = FALSE,
plot.res = ifelse(!is.null(outdir), TRUE, FALSE),
plot.fitted = NULL,
plot.predicted = NULL,
print.plot = FALSE,
plot.theme = rtTheme,
x.name = NULL,
y.name = NULL,
question = NULL,
verbose = TRUE,
trace = 0,
grid.verbose = verbose,
gbm.fit.verbose = FALSE,
outdir = NULL,
save.gridrun = FALSE,
save.res = FALSE,
save.res.mod = FALSE,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE)
)
x |
Numeric vector or matrix / data frame of features, i.e. independent variables |
y |
Numeric vector of outcome, i.e. dependent variable |
x.test |
Numeric vector or matrix / data frame of testing set features.
Columns must correspond to columns in x |
y.test |
Numeric vector of testing set outcome |
weights |
Numeric vector: Weights for cases. For classification, |
ifw |
Logical: If TRUE, apply inverse frequency weighting
(for Classification only).
Note: If |
ifw.type |
Integer: 0, 1, or 2. 1: class.weights as in 0, divided by min(class.weights). 2: class.weights as in 0, divided by max(class.weights) |
upsample |
Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Note: upsample will randomly sample with replacement if the length of the majority class is more than double the length of the class you are upsampling, thereby introducing randomness |
downsample |
Logical: If TRUE, downsample majority class to match size of minority class |
resample.seed |
Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed) |
distribution |
Character: Distribution of the response variable. See gbm::gbm |
interaction.depth |
[gS] Integer: Interaction depth. |
shrinkage |
[gS] Float: Shrinkage (learning rate). |
bag.fraction |
[gS] Float (0, 1): Fraction of cases to use to train each tree. Helps avoid overfitting. |
n.minobsinnode |
[gS] Integer: Minimum number of observations allowed in a node. |
n.trees |
Integer: Initial number of trees to fit |
max.trees |
Integer: Maximum number of trees to fit |
force.n.trees |
Integer: If specified, use this number of trees instead of tuning number of trees |
gbm.select.smooth |
Logical: If TRUE, smooth the validation error curve. |
n.new.trees |
Integer: Number of new trees to train if stopping criteria have not been met. |
min.trees |
Integer: Minimum number of trees to fit. |
failsafe.trees |
Integer: If tuning fails to find n.trees, use this number instead. |
imetrics |
Logical: If TRUE, save |
.gs |
Internal use only |
grid.resample.params |
List: Output of setup.resample defining grid search parameters. |
gridsearch.type |
Character: Type of grid search to perform: "exhaustive" or "randomized". |
metric |
Character: Metric to minimize, or maximize if
|
maximize |
Logical: If TRUE, |
plot.tune.error |
Logical: If TRUE, plot the tuning error curve. |
n.cores |
Integer: Number of cores to use. |
relInf |
Logical: If TRUE (Default), estimate variables' relative influence. |
varImp |
Logical: If TRUE, estimate variable importance by permutation (as in random forests; noted as experimental in gbm). Takes longer than (default) relative influence. The two measures are highly correlated. |
offset |
Numeric vector of offset values, passed to |
var.monotone |
Integer vector with values 0, 1, -1 and length = N features.
Used to define monotonicity constraints. |
plot.fitted |
Logical: if TRUE, plot True (y) vs Fitted |
plot.predicted |
Logical: if TRUE, plot True (y.test) vs Predicted.
Requires |
print.plot |
Logical: if TRUE, produce plot using |
plot.theme |
Character: "zero", "dark", "box", "darkbox" |
x.name |
Character: Name for feature set |
y.name |
Character: Name for outcome |
question |
Character: the question you are attempting to answer with this model, in plain language. |
verbose |
Logical: If TRUE, print summary to screen. |
grid.verbose |
Logical: Passed to |
outdir |
Character: If defined, save log, 'plot.all' plots (see above) and RDS file of complete output |
save.gridrun |
Logical: If TRUE, save grid search models. |
save.res.mod |
Logical: If TRUE, save gbm model for each grid run. For diagnostic purposes only: Object size adds up quickly |
save.mod |
Logical: If TRUE, save all output to an RDS file in outdir |
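The arithmetic behind ifw.type can be illustrated in base R. This is only a sketch: the description above does not say how the type 0 weights are defined, so the plain inverse class frequency used here is an assumption, and the variable names are made up for the example.

```r
# Toy outcome with an 80/20 class imbalance
y <- factor(c(rep("a", 80), rep("b", 20)))
class_prop <- table(y) / length(y)

# ASSUMPTION: "type 0" weights are plain inverse class frequencies
w0 <- 1 / class_prop
# Type 1: divide by the smallest weight, so the smallest weight becomes 1
w1 <- w0 / min(w0)
# Type 2: divide by the largest weight, so the largest weight becomes 1
w2 <- w0 / max(w0)

# Expand class weights to one weight per case
case_weights <- as.numeric(w2[as.character(y)])
```

Under this assumption, type 2 (the default) keeps all case weights in (0, 1], while type 1 keeps them at 1 or above.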
Early stopping is implemented by fitting n.trees initially, checking the optionally smoothed validation error curve, and adding n.new.trees if needed, until error does not reduce or max.trees is reached.
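The early stopping loop can be sketched in base R as follows. This is not the rtemis implementation: validation_error() is a hypothetical stand-in that returns a synthetic error curve instead of fitting a model, select_n_trees() is a made-up name, and the stopping rule (the minimum lies before the last fitted tree) is an assumption about what "error does not reduce" means.

```r
# Hypothetical stand-in for the validation error after each tree.
# Synthetic convex curve with its minimum at tree 1200.
validation_error <- function(n_trees) {
  trees <- seq_len(n_trees)
  1 + (trees - 1200)^2 / 1e6
}

select_n_trees <- function(n.trees = 500, n.new.trees = 500,
                           max.trees = 5000) {
  repeat {
    err <- validation_error(n.trees)
    best <- which.min(err)
    # Stop when the minimum lies before the last fitted tree (error has
    # stopped decreasing) or when no more trees may be added
    if (best < n.trees || n.trees >= max.trees) break
    n.trees <- min(n.trees + n.new.trees, max.trees)
  }
  list(n.trees.fitted = n.trees, best.n.trees = best)
}

res <- select_n_trees()
capped <- select_n_trees(max.trees = 1000)
```

With the synthetic curve, the loop extends from 500 to 1500 trees before the minimum falls inside the fitted range; with max.trees = 1000 it stops at the cap instead.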
[gS] in the argument description indicates that a vector of values can be passed, in which case grid search will be performed automatically using the resampling scheme defined by grid.resample.params.
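To get a sense of how quickly an exhaustive grid grows, the combinations can be enumerated in base R. The candidate values below are illustrative, and the fit count assumes one model per combination per resample, which this page does not state explicitly.

```r
# Candidate values for the [gS] arguments (illustrative choices)
grid <- expand.grid(
  interaction.depth = c(2, 3),
  shrinkage = c(0.01, 0.1),
  bag.fraction = 0.9,
  n.minobsinnode = 5
)
n_combinations <- nrow(grid) # 2 * 2 * 1 * 1

# Five folds, as in the default setup.resample("kfold", 5)
n_folds <- 5
# ASSUMPTION: one fit per parameter combination per fold
n_fits <- n_combinations * n_folds

# Not run: the corresponding call, assuming rtemis and gbm are installed
# and x, y hold your features and outcome
# mod <- s_GBM(x, y, interaction.depth = c(2, 3), shrinkage = c(0.01, 0.1))
```

Passing a single value for every [gS] argument leaves a one-row grid, so no search is needed.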
This function includes a workaround for when gbm.fit fails. If an error is detected, gbm.fit is rerun until successful and the procedure continues normally.
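A rerun-on-error pattern of this kind can be sketched with tryCatch. This is a generic illustration, not the code used by s_GBM: flaky_fit() is a hypothetical function that simulates two failures, retry() is a made-up helper, and the max_tries cap is an added safeguard that the description above does not mention.

```r
# Hypothetical fitting function that fails on its first two calls
attempts <- 0
flaky_fit <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("simulated fitting failure")
  "fitted model"
}

# Rerun f until it succeeds; max_tries guards against an endless loop
retry <- function(f, max_tries = 10) {
  for (i in seq_len(max_tries)) {
    # Convert an error into NULL so the loop can try again
    result <- tryCatch(f(), error = function(e) NULL)
    if (!is.null(result)) return(result)
  }
  stop("all attempts failed")
}

mod <- retry(flaky_fit)
```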
E.D. Gennatas
train_cv for external cross-validation
Other Supervised Learning: s_AdaBoost(), s_AddTree(), s_BART(), s_BRUTO(), s_BayesGLM(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GAM(), s_GLM(), s_GLMNET(), s_GLMTree(), s_GLS(), s_H2ODL(), s_H2OGBM(), s_H2ORF(), s_HAL(), s_KNN(), s_LDA(), s_LM(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MARS(), s_MLRF(), s_NBayes(), s_NLA(), s_NLS(), s_NW(), s_PPR(), s_PolyMARS(), s_QDA(), s_QRNN(), s_RF(), s_RFSRC(), s_Ranger(), s_SDA(), s_SGD(), s_SPLS(), s_SVM(), s_TFN(), s_XGBoost(), s_XRF()
Other Tree-based methods: s_AdaBoost(), s_AddTree(), s_BART(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GLMTree(), s_H2OGBM(), s_H2ORF(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MLRF(), s_RF(), s_RFSRC(), s_Ranger(), s_XGBoost(), s_XRF()
Other Ensembles: s_AdaBoost(), s_RF(), s_Ranger()