mlr_learners_regr.lightgbm
Gradient boosting algorithm.
Calls lightgbm::lightgbm() from package lightgbm.
The full list of parameters can be found in the LightGBM parameter documentation (https://lightgbm.readthedocs.io/en/latest/Parameters.html) and in the documentation of lightgbm::lgb.train().
This Learner can be instantiated via lrn():
lrn("regr.lightgbm")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3extralearners, lightgbm
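Hyperparameters from the table below can be passed to lrn() directly or set later through the learner's param_set. A minimal sketch (the chosen values are illustrative, not recommendations):

library(mlr3)
library(mlr3extralearners)

# construct the learner with a few non-default settings
learner = lrn("regr.lightgbm", num_leaves = 63, learning_rate = 0.05)

# parameters can also be changed after construction
learner$param_set$set_values(min_data_in_leaf = 5)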
Id | Type | Default | Levels | Range |
objective | character | regression | regression, regression_l1, huber, fair, poisson, quantile, mape, gamma, tweedie | - |
eval | untyped | - | - | - |
verbose | integer | 1 | - | (-\infty, \infty) |
record | logical | TRUE | TRUE, FALSE | - |
eval_freq | integer | 1 | - | [1, \infty) |
callbacks | untyped | - | - | - |
reset_data | logical | FALSE | TRUE, FALSE | - |
boosting | character | gbdt | gbdt, rf, dart, goss | - |
linear_tree | logical | FALSE | TRUE, FALSE | - |
learning_rate | numeric | 0.1 | - | [0, \infty) |
num_leaves | integer | 31 | - | [1, 131072] |
tree_learner | character | serial | serial, feature, data, voting | - |
num_threads | integer | 0 | - | [0, \infty) |
device_type | character | cpu | cpu, gpu | - |
seed | integer | - | - | (-\infty, \infty) |
deterministic | logical | FALSE | TRUE, FALSE | - |
data_sample_strategy | character | bagging | bagging, goss | - |
force_col_wise | logical | FALSE | TRUE, FALSE | - |
force_row_wise | logical | FALSE | TRUE, FALSE | - |
histogram_pool_size | integer | -1 | - | (-\infty, \infty) |
max_depth | integer | -1 | - | (-\infty, \infty) |
min_data_in_leaf | integer | 20 | - | [0, \infty) |
min_sum_hessian_in_leaf | numeric | 0.001 | - | [0, \infty) |
bagging_fraction | numeric | 1 | - | [0, 1] |
bagging_freq | integer | 0 | - | [0, \infty) |
bagging_seed | integer | 3 | - | (-\infty, \infty) |
bagging_by_query | logical | FALSE | TRUE, FALSE | - |
feature_fraction | numeric | 1 | - | [0, 1] |
feature_fraction_bynode | numeric | 1 | - | [0, 1] |
feature_fraction_seed | integer | 2 | - | (-\infty, \infty) |
extra_trees | logical | FALSE | TRUE, FALSE | - |
extra_seed | integer | 6 | - | (-\infty, \infty) |
max_delta_step | numeric | 0 | - | (-\infty, \infty) |
lambda_l1 | numeric | 0 | - | [0, \infty) |
lambda_l2 | numeric | 0 | - | [0, \infty) |
linear_lambda | numeric | 0 | - | [0, \infty) |
min_gain_to_split | numeric | 0 | - | [0, \infty) |
drop_rate | numeric | 0.1 | - | [0, 1] |
max_drop | integer | 50 | - | (-\infty, \infty) |
skip_drop | numeric | 0.5 | - | [0, 1] |
xgboost_dart_mode | logical | FALSE | TRUE, FALSE | - |
uniform_drop | logical | FALSE | TRUE, FALSE | - |
drop_seed | integer | 4 | - | (-\infty, \infty) |
top_rate | numeric | 0.2 | - | [0, 1] |
other_rate | numeric | 0.1 | - | [0, 1] |
min_data_per_group | integer | 100 | - | [1, \infty) |
max_cat_threshold | integer | 32 | - | [1, \infty) |
cat_l2 | numeric | 10 | - | [0, \infty) |
cat_smooth | numeric | 10 | - | [0, \infty) |
max_cat_to_onehot | integer | 4 | - | [1, \infty) |
top_k | integer | 20 | - | [1, \infty) |
monotone_constraints | untyped | NULL | - | - |
monotone_constraints_method | character | basic | basic, intermediate, advanced | - |
monotone_penalty | numeric | 0 | - | [0, \infty) |
feature_contri | untyped | NULL | - | - |
forcedsplits_filename | untyped | "" | - | - |
refit_decay_rate | numeric | 0.9 | - | [0, 1] |
cegb_tradeoff | numeric | 1 | - | [0, \infty) |
cegb_penalty_split | numeric | 0 | - | [0, \infty) |
cegb_penalty_feature_lazy | untyped | - | - | - |
cegb_penalty_feature_coupled | untyped | - | - | - |
path_smooth | numeric | 0 | - | [0, \infty) |
interaction_constraints | untyped | - | - | - |
use_quantized_grad | logical | TRUE | TRUE, FALSE | - |
num_grad_quant_bins | integer | 4 | - | (-\infty, \infty) |
quant_train_renew_leaf | logical | FALSE | TRUE, FALSE | - |
stochastic_rounding | logical | TRUE | TRUE, FALSE | - |
serializable | logical | TRUE | TRUE, FALSE | - |
max_bin | integer | 255 | - | [2, \infty) |
max_bin_by_feature | untyped | NULL | - | - |
min_data_in_bin | integer | 3 | - | [1, \infty) |
bin_construct_sample_cnt | integer | 200000 | - | [1, \infty) |
data_random_seed | integer | 1 | - | (-\infty, \infty) |
is_enable_sparse | logical | TRUE | TRUE, FALSE | - |
enable_bundle | logical | TRUE | TRUE, FALSE | - |
use_missing | logical | TRUE | TRUE, FALSE | - |
zero_as_missing | logical | FALSE | TRUE, FALSE | - |
feature_pre_filter | logical | TRUE | TRUE, FALSE | - |
pre_partition | logical | FALSE | TRUE, FALSE | - |
two_round | logical | FALSE | TRUE, FALSE | - |
forcedbins_filename | untyped | "" | - | - |
boost_from_average | logical | TRUE | TRUE, FALSE | - |
reg_sqrt | logical | FALSE | TRUE, FALSE | - |
alpha | numeric | 0.9 | - | [0, \infty) |
fair_c | numeric | 1 | - | [0, \infty) |
poisson_max_delta_step | numeric | 0.7 | - | [0, \infty) |
tweedie_variance_power | numeric | 1.5 | - | [1, 2] |
metric_freq | integer | 1 | - | [1, \infty) |
num_machines | integer | 1 | - | [1, \infty) |
local_listen_port | integer | 12400 | - | [1, \infty) |
time_out | integer | 120 | - | [1, \infty) |
machines | untyped | "" | - | - |
gpu_platform_id | integer | -1 | - | (-\infty, \infty) |
gpu_device_id | integer | -1 | - | (-\infty, \infty) |
gpu_use_dp | logical | FALSE | TRUE, FALSE | - |
num_gpu | integer | 1 | - | [1, \infty) |
start_iteration_predict | integer | 0 | - | (-\infty, \infty) |
num_iteration_predict | integer | -1 | - | (-\infty, \infty) |
pred_early_stop | logical | FALSE | TRUE, FALSE | - |
pred_early_stop_freq | integer | 10 | - | (-\infty, \infty) |
pred_early_stop_margin | numeric | 10 | - | (-\infty, \infty) |
num_iterations | integer | 100 | - | [1, \infty) |
early_stopping_rounds | integer | - | - | [1, \infty) |
early_stopping_min_delta | numeric | - | - | [0, \infty) |
first_metric_only | logical | FALSE | TRUE, FALSE | - |
num_threads:
Actual default: 0L
Initial value: 1L
Reason for change: Prevents accidental conflicts with the future parallelization package.

verbose:
Actual default: 1L
Initial value: -1L
Reason for change: Prevents accidental conflicts with the mlr messaging system.
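If multi-threaded training or lightgbm's own logging is wanted, these initial values can simply be overridden (a minimal sketch; the thread count is an arbitrary choice):

learner = mlr3::lrn("regr.lightgbm")
# restore multi-threading and lightgbm's default logging
learner$param_set$set_values(num_threads = 4, verbose = 1)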
Early stopping can be used to find the optimal number of boosting rounds.
Set early_stopping_rounds to an integer value to monitor the performance of the model on the validation set while training.
For information on how to configure the validation set, see the Validation section of mlr3::Learner.
The internal validation measure can be set via the eval parameter, which should be a list of mlr3::Measures, functions, or strings naming internal lightgbm measures.
If first_metric_only = FALSE (default), the learner stops when any metric fails to improve.
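A minimal sketch of this workflow, assuming the mtcars toy task and lightgbm's built-in "l2" metric (set_validate() from mlr3 constructs the validation split):

library(mlr3)
library(mlr3extralearners)

learner = lrn("regr.lightgbm",
  num_iterations = 1000,        # upper bound on boosting rounds
  early_stopping_rounds = 10,   # stop after 10 rounds without improvement
  eval = list("l2")             # internal lightgbm measure, given as a string
)
set_validate(learner, validate = 0.2)  # hold out 20% of the training data

task = tsk("mtcars")
learner$train(task)

learner$internal_tuned_values   # early-stopped number of iterations
learner$internal_valid_scores   # last validation score(s)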
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrLightGBM
internal_valid_scores:
The last observation of the validation scores for all metrics, extracted from model$evaluation_log.

internal_tuned_values:
Returns the early-stopped iterations if early_stopping_rounds was set during training.

validate:
How to construct the internal validation data. This parameter can be either NULL, a ratio, "test", or "predefined".
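For example, a fixed validation set can be attached to the task and then referenced with validate = "predefined" (a hedged sketch; assigning row ids to task$internal_valid_task is the mechanism assumed here):

task = mlr3::tsk("mtcars")
split = mlr3::partition(task, ratio = 0.8)
task$internal_valid_task = split$test  # these rows are used only for validation
learner$validate = "predefined"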
new()
Creates a new instance of this R6 class.
LearnerRegrLightGBM$new()
importance()
The importance scores are extracted from lightgbm::lgb.importance().
LearnerRegrLightGBM$importance()
Returns a named numeric().
clone()
The objects of this class are cloneable with this method.
LearnerRegrLightGBM$clone(deep = FALSE)
deep
Whether to make a deep clone.
Author(s): kapsner
Ke, Guolin, Meng, Qi, Finley, Thomas, Wang, Taifeng, Chen, Wei, Ma, Weidong, Ye, Qiwei, Liu, Tie-Yan (2017). "LightGBM: A highly efficient gradient boosting decision tree." Advances in Neural Information Processing Systems, 30.
Dictionary of Learners: mlr3::mlr_learners.
as.data.table(mlr_learners)
for a table of available Learners in the running session (depending on the loaded packages).
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
mlr3learners for a selection of recommended learners.
mlr3cluster for unsupervised clustering learners.
mlr3pipelines to combine learners with pre- and postprocessing steps.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
# Define the Learner
learner = mlr3::lrn("regr.lightgbm")
print(learner)
# Define a Task
task = mlr3::tsk("mtcars")
# Create train and test set
ids = mlr3::partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
print(learner$model)
print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()