knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)

Install the lightgbm R package with GPU support.

Before you can use the mlr3learners.lightgbm package with GPU acceleration, you need to install the lightgbm R package according to its documentation (this is necessary since lightgbm is neither on CRAN nor installable via devtools::install_github). You can compile the GPU version on Linux or in Docker with the following commands:

git clone --recursive --branch stable --depth 1 https://github.com/microsoft/LightGBM
cd LightGBM && \
Rscript build_r.R --use-gpu

Then you can install the mlr3learners.lightgbm package:

install.packages("devtools")
devtools::install_github("kapsner/mlr3learners.lightgbm")
library(mlr3)
library(mlr3learners.lightgbm)

Create the mlr3 task

task = mlr3::tsk("iris")

To have independent training and test data, we further create a list, split, containing the respective row indices.

set.seed(17)
split = list(
  train_index = sample(seq_len(task$nrow), size = 0.7 * task$nrow)
)
split$test_index = setdiff(seq_len(task$nrow), split$train_index)

table(task$data()[split$train_index, task$target_names, with = FALSE])
table(task$data()[split$test_index, task$target_names, with = FALSE])
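
As a quick sanity check, the split logic above can be reproduced with base R alone. The following is a minimal sketch that uses the built-in iris data frame in place of the mlr3 task, so n is an assumption standing in for task$nrow. It verifies that the two index sets are disjoint and together cover all rows.

```r
# Sketch using base R only; n stands in for task$nrow.
n = nrow(iris)

set.seed(17)
train_index = sample(seq_len(n), size = 0.7 * n)
test_index = setdiff(seq_len(n), train_index)

# The two index sets should be disjoint and together cover every row.
length(intersect(train_index, test_index)) == 0
length(train_index) + length(test_index) == n

# Class distribution in each partition (note that the split is not stratified).
table(iris$Species[train_index])
table(iris$Species[test_index])
```

Because the sampling is not stratified, the class counts in the two partitions will generally differ somewhat, which is what the table() calls make visible.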

Instantiate the lightgbm learner

Then, the classif.lightgbm class needs to be instantiated:

learner = mlr3::lrn("classif.lightgbm")

Configure the learner

In the next step, some of the learner's parameters need to be set, e.g., num_iterations and early_stopping_round. Please refer to the LightGBM manual for further details on these parameters. Almost all possible parameters have been implemented here. You can inspect them using the following command:

learner$param_set

In order to use the GPU acceleration, the parameter device_type = "gpu" (default: "cpu") needs to be set. According to the LightGBM parameter manual, 'it is recommended to use the smaller max_bin (e.g. 63) to get the better speed up'.

learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(
    "objective" = "multiclass",
    "device_type" = "gpu",
    "max_bin" = 63L,
    "early_stopping_round" = 10,
    "learning_rate" = 0.1,
    "seed" = 17L,
    "metric" = "multi_logloss",
    "num_iterations" = 100,
    "num_class" = 3
  )
)
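
The call above merges new values into the list of already configured parameters: entries with a matching name are overwritten, all others are kept or added. A rough base R analogue of this behaviour, using utils::modifyList as a stand-in for mlr3misc::insert_named, is sketched below. The lists current and updates are hypothetical, and their values are illustrative rather than the learner's actual defaults.

```r
# Base R analogue of updating a named parameter list.
# 'current' and 'updates' are hypothetical; the values are illustrative only.
current = list(learning_rate = 0.3, max_bin = 255L)
updates = list(max_bin = 63L, device_type = "gpu")

# Same-named entries are overwritten, new entries are appended.
merged = utils::modifyList(current, updates)
str(merged)
```

Note that modifyList differs in some details (for example, a NULL value removes an entry), so this is only meant to illustrate the merge semantics for flat lists.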

Train the learner

The learner is now ready to be trained by using its train function.

learner$train(task, row_ids = split$train_index)

Evaluate the model performance

Basic metrics can be assessed directly from the learner model:

learner$model$current_iter()

The learner's predict function returns an object of mlr3's class PredictionClassif.

predictions = learner$predict(task, row_ids = split$test_index)
head(predictions$response)

The predictions object also includes a confusion matrix:

predictions$confusion
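
To illustrate how such a confusion matrix is read, the following sketch builds one with base R from two small label vectors. The labels are invented for illustration and are not results from the model above. As in mlr3, the predicted response is placed in the rows and the truth in the columns.

```r
# Toy labels invented for illustration; not results from the model above.
lv = c("setosa", "versicolor", "virginica")
truth = factor(
  c("setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"),
  levels = lv
)
response = factor(
  c("setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"),
  levels = lv
)

# Rows: predicted response, columns: truth.
confusion = table(response = response, truth = truth)
confusion

# Accuracy is the share of observations on the diagonal.
accuracy = sum(diag(confusion)) / sum(confusion)
accuracy
```

Off-diagonal cells count misclassifications; here one versicolor observation is predicted as virginica.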

Further metrics can be calculated by using mlr3 measures:

predictions$score(mlr3::msr("classif.logloss"))
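
The log loss is computed from predicted class probabilities rather than from the predicted labels. As a minimal sketch of the underlying formula (the mean negative log-probability assigned to the true class), the following base R code uses a small, made-up probability matrix. The names prob, truth and p_true are hypothetical, and the numerical clipping that measure implementations typically apply is omitted.

```r
# Toy probabilities invented for illustration.
# Rows are observations, columns are classes.
prob = matrix(
  c(0.8, 0.1, 0.1,
    0.2, 0.7, 0.1,
    0.1, 0.3, 0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("setosa", "versicolor", "virginica"))
)
truth = c("setosa", "versicolor", "virginica")

# Probability assigned to the true class of each observation.
p_true = prob[cbind(seq_along(truth), match(truth, colnames(prob)))]

# Multiclass log loss: mean negative log-probability of the true class.
logloss = -mean(log(p_true))
logloss
```

Lower values are better; a model that assigns probability 1 to every true class would reach a log loss of 0.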

The variable importance can be calculated by using the learner's importance function:

importance = learner$importance()
importance


kapsner/mlr3learners.lightgbm documentation built on Feb. 17, 2021, 5:53 p.m.