knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
Before you can use the mlr3learners.lightgbm package with GPU acceleration, you need to install the lightgbm R package according to its documentation (this is necessary since lightgbm is neither on CRAN nor installable via devtools::install_github).
You can compile the GPU version on Linux or inside a Docker container with the following commands:
git clone --recursive --branch stable --depth 1 https://github.com/microsoft/LightGBM && \
  cd LightGBM && \
  Rscript build_r.R --use-gpu
Then you can install the mlr3learners.lightgbm package:
install.packages("devtools")
devtools::install_github("kapsner/mlr3learners.lightgbm")
library(mlr3)
library(mlr3learners.lightgbm)
task = mlr3::tsk("iris")
To have independent training and test data, we further create a list, split, containing the respective row indices.
set.seed(17)
split = list(
  train_index = sample(seq_len(task$nrow), size = 0.7 * task$nrow)
)
split$test_index = setdiff(seq_len(task$nrow), split$train_index)

table(task$data()[split$train_index, task$target_names, with = FALSE])
table(task$data()[split$test_index, task$target_names, with = FALSE])
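As a quick sanity check (plain base R, mirroring the split logic above on the iris task's 150 rows), the two index sets should be disjoint and together cover all rows:

```r
# Reconstruct the same split on the known row count of iris (150);
# with the real task object, replace 150 with task$nrow.
set.seed(17)
n = 150
train_index = sample(seq_len(n), size = 0.7 * n)
test_index = setdiff(seq_len(n), train_index)

stopifnot(length(intersect(train_index, test_index)) == 0)  # disjoint
stopifnot(length(union(train_index, test_index)) == n)      # covers all rows
```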
Then, the classif.lightgbm class needs to be instantiated:
learner = mlr3::lrn("classif.lightgbm")
In the next step, some of the learner's parameters need to be set, e.g. num_iterations and early_stopping_round. Please refer to the LightGBM manual for further details on these parameters. Almost all of LightGBM's parameters have been implemented here. You can inspect them using the following command:
learner$param_set
In order to use GPU acceleration, the parameter device_type = "gpu" (default: "cpu") needs to be set. According to the LightGBM parameter manual, 'it is recommended to use the smaller max_bin (e.g. 63) to get the better speed up'.
learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(
    "objective" = "multiclass",
    "device_type" = "gpu",
    "max_bin" = 63L,
    "early_stopping_round" = 10,
    "learning_rate" = 0.1,
    "seed" = 17L,
    "metric" = "multi_logloss",
    "num_iterations" = 100,
    "num_class" = 3
  )
)
The learner is now ready to be trained using its train function.
learner$train(task, row_ids = split$train_index)
Basic metrics can be assessed directly from the learner model:
learner$model$current_iter()
The learner's predict function returns an object of mlr3's class PredictionClassif.
predictions = learner$predict(task, row_ids = split$test_index)
head(predictions$response)
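If class probabilities are needed instead of hard labels, the learner's predict_type can be set to "prob" before training. This is a sketch using mlr3's standard predict_type mechanism; the learner must be re-trained after changing it:

```r
# Sketch: request probability predictions via mlr3's predict_type field,
# then re-train and predict again with the same row splits as above.
learner$predict_type = "prob"
learner$train(task, row_ids = split$train_index)
predictions = learner$predict(task, row_ids = split$test_index)
head(predictions$prob)  # one column per class level
```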
The predictions object also includes a confusion matrix:
predictions$confusion
Further metrics can be calculated by using mlr3 measures:
predictions$score(mlr3::msr("classif.logloss"))
The variable importance can be calculated using the learner's importance function:
importance = learner$importance()
importance
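The returned importance values are a named numeric vector, so a simple plot can be produced with base R. The feature values below are hypothetical placeholders; in practice, use the importance vector obtained above:

```r
# Hypothetical importance values for illustration only;
# replace with `importance = learner$importance()` from above.
importance = c(
  Petal.Length = 120, Petal.Width = 80,
  Sepal.Length = 30, Sepal.Width = 10
)
barplot(sort(importance), horiz = TRUE, las = 1, main = "Feature importance")
```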