The r mlr_pkg("mlr3")
[@mlr3] package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language [@R].
We do not implement any learners ourselves, but provide a unified interface to many existing learners in R.
This unified interface provides functionality to extend and combine existing learners, intelligently select and tune the most appropriate technique for a task, and perform large-scale comparisons that enable meta-learning.
Examples of this advanced functionality include hyperparameter tuning, feature selection, and ensemble construction. Parallelization of many operations is natively supported.
Target Audience
r mlr_pkg("mlr3")
provides a domain-specific language for machine learning in R.
We target both practitioners who want to quickly apply machine learning algorithms and researchers who want to implement, benchmark, and compare their new methods in a structured environment.
The package is a complete rewrite of an earlier version of r mlr_pkg("mlr")
that leverages many years of experience to provide a state-of-the-art system that is easy to use and extend.
It is intended for users who have basic knowledge of machine learning and R and who are interested in complex projects that use advanced functionality as well as one-liners to quickly prototype specific tasks.
Why a Rewrite?
r mlr_pkg("mlr")
[@mlr] was first released to CRAN in 2013, with the core design and architecture dating back much further.
Over time, the addition of many features has led to a considerably more complex design that made it harder to build, maintain, and extend than we had hoped for.
With hindsight, we saw that some of the design and architecture choices in r mlr_pkg("mlr")
made it difficult to support new features, in particular with respect to pipelines.
Furthermore, the R ecosystem as well as helpful packages such as r cran_pkg("data.table")
have undergone major changes in the meantime.
It would have been nearly impossible to integrate all of these changes into the original design of r mlr_pkg("mlr")
.
Instead, we decided to start working on a reimplementation in 2018, which resulted in the first release of r mlr_pkg("mlr3")
on CRAN in July 2019.
The new design and the integration of further and newly developed R packages (R6, future, data.table) makes r mlr_pkg("mlr3")
much easier to use, maintain, and more efficient compared to r mlr_pkg("mlr")
.
Design Principles
We follow these general design principles in the r mlr_pkg("mlr3")
package and ecosystem.
r mlr_pkg("mlr3")
ecosystem focus on processing and transforming data, applying machine learning algorithms, and computing results.
We do not provide graphical user interfaces (GUIs); visualizations of data and results are provided in extra packages.r cran_pkg("R6")
for a clean, object-oriented design, object state-changes, and reference semantics.r cran_pkg("data.table")
for fast and convenient data frame computations.data.table
s.
This considerably simplifies the API and allows easy selection and "split-apply-combine" (aggregation) operations.
We combine data.table
and R6
to place references to non-atomic and compound objects in tables and make heavy use of list columns.checkmate
[@checkmate].
Return types are documented, and mechanisms popular in base R which "simplify" the result unpredictably (e.g., sapply()
or the drop
argument in [.data.frame
) are avoided.r mlr_pkg("mlr")
was to keep up with changing learner interfaces and behavior of the many packages it depended on.
We require far fewer packages in r mlr_pkg("mlr3")
to make installation and maintenance easier.r mlr_pkg("mlr3")
requires the following packages:
r cran_pkg("backports")
:
Ensures backward compatibility with older R releases. Developed by members of the r mlr_pkg("mlr3")
team.r cran_pkg("checkmate")
:
Fast argument checks. Developed by members of the r mlr_pkg("mlr3")
team.r cran_pkg("mlr3misc")
:
Miscellaneous functions used in multiple mlr3 extension packages.
Developed by the r mlr_pkg("mlr3")
team.r cran_pkg("mlr3measures")
:
Performance measures for classification and regression. Developed by members of the r mlr_pkg("mlr3")
team.r cran_pkg("paradox")
:
Descriptions of parameters and parameter sets. Developed by the r mlr_pkg("mlr3")
team.r cran_pkg("R6")
:
Reference class objects.r cran_pkg("data.table")
:
Extension of R's data.frame
.r cran_pkg("digest")
:
Hash digests.r cran_pkg("uuid")
:
Unique string identifiers.r cran_pkg("lgr")
:
Logging facility.r cran_pkg("mlbench")
:
A collection of machine learning data sets.None of these packages adds any extra recursive dependencies to r mlr_pkg("mlr3")
.
r mlr_pkg("mlr3")
provides additional functionality through extra packages:
r mlr_pkg("mlr3")
utilizes the r cran_pkg("future")
and r cran_pkg("future.apply")
packages.r cran_pkg("evaluate")
and r cran_pkg("callr")
can be used.Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.