mlr: Machine Learning in R

This vignette gives a short introduction to the key features of mlr. A more detailed, continuously updated tutorial can be found on the GitHub project page.

Purpose

The main goal of mlr is to provide a unified interface for machine learning tasks such as classification, regression, cluster analysis and survival analysis in R. Without a common interface, it becomes a hassle to carry out standard methods like cross-validation and hyperparameter tuning for different learners. Hence, mlr offers the following features:

Quick Start

To highlight the main principles of mlr, we give a quick introduction to the package. We demonstrate how to perform a classification analysis using stratified cross-validation, which illustrates some of the major building blocks of the mlr workflow, namely tasks and learners.

library(mlr)
data(iris)

## Define the task:
task = makeClassifTask(id = "tutorial", data = iris, target = "Species")
print(task)

## Define the learner:
lrn = makeLearner("classif.lda")
print(lrn)

## Define the resampling strategy:
rdesc = makeResampleDesc(method = "CV", stratify = TRUE)

## Do the resampling:
r = resample(learner = lrn, task = task, resampling = rdesc)
print(r)

## Get the mean misclassification error:
r$aggr
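
To make the resampling step more concrete, the following base R sketch reproduces roughly what a stratified cross-validation of an LDA model involves. It is not mlr's implementation: the fold count `k`, the seed, and the variable names `fold` and `mmce` are assumptions chosen for illustration, and it calls `MASS::lda` directly instead of going through a learner object.

```r
library(MASS)  # provides lda()
data(iris)

set.seed(1)    # arbitrary seed so the fold assignment is reproducible
k = 10         # number of folds, chosen here for illustration

# Stratified fold assignment: within each class, shuffle the fold labels 1..k
# so that every fold receives the same share of each species.
fold = integer(nrow(iris))
for (cl in levels(iris$Species)) {
  idx = which(iris$Species == cl)
  fold[idx] = sample(rep_len(seq_len(k), length(idx)))
}

# Fit on k - 1 folds, predict the held-out fold, record its error rate.
mmce = vapply(seq_len(k), function(i) {
  train = iris[fold != i, ]
  test = iris[fold == i, ]
  fit = lda(Species ~ ., data = train)
  pred = predict(fit, newdata = test)$class
  mean(pred != test$Species)
}, numeric(1))

# Mean misclassification error across the k folds
mean(mmce)
```

The value will generally differ from `r$aggr` above because the fold assignment is random, but both numbers estimate the same quantity. With mlr, swapping the model only requires a different learner; in the manual version the fitting and prediction calls would have to be rewritten.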

Detailed Tutorial

The previous example demonstrated only a tiny fraction of the capabilities of mlr. More features are covered in the tutorial, which can be found online on the mlr project page. Among other topics, it covers benchmarking, preprocessing, imputation, feature selection, ROC analysis, how to implement your own learner, and the list of all supported learners. Reading it is highly recommended!
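
As a small taste of one of these topics, here is a minimal benchmarking sketch that compares two learners on the iris task from the Quick Start. It assumes the rpart package is installed for `classif.rpart`; the choice of learners and the variable names `lrns`, `bmr` and `perf` are illustrative only, and the tutorial remains the authoritative reference.

```r
library(mlr)
data(iris)

# Same task and resampling description as in the Quick Start
task = makeClassifTask(id = "tutorial", data = iris, target = "Species")
rdesc = makeResampleDesc(method = "CV", stratify = TRUE)

# Two learners to compare (classif.lda needs MASS, classif.rpart needs rpart)
lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))

# Run every learner on the task with the same resampling strategy,
# measuring the mean misclassification error (mmce)
bmr = benchmark(learners = lrns, tasks = task, resamplings = rdesc,
  measures = mmce)

# One row per task/learner pair with the aggregated test error
perf = getBMRAggrPerformances(bmr, as.df = TRUE)
print(perf)
```

Because both learners are evaluated through the same interface, adding a third one only means appending another `makeLearner()` call to the list.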

Thanks

We would like to thank the authors of all packages which mlr uses under the hood:

parsePkgs = function(x) {
  x = strsplit(x, "\n|,")[[1L]]
  # remove version requirement in (...)
  x = sub("\\(.*\\)", "", x)
  # trim whitespace (cannot be inside name)
  x = gsub(" ", "", x)
  # drop empty strings (an empty field becomes character(0))
  x[nzchar(x)]
}

desc = packageDescription("mlr")
pkgs = c(parsePkgs(desc$Depends), parsePkgs(desc$Imports), parsePkgs(desc$Suggests))
pkgs = sort(setdiff(pkgs, c("R", "stats", "methods", "utils")))
cat(sprintf("* [%1$s](https://cran.r-project.org/package=%1$s)", pkgs), sep = "\n")


guillermozbta/mir documentation built on May 11, 2019, 6:27 p.m.