utiml: utiml: Utilities for Multi-Label Learning

Description Details Notes Author(s)

Description

The utiml package is a framework for the application of classification algorithms to multi-label data. Like the well known MULAN used with Weka, it provides a set of multi-label procedures such as sampling methods, transformation strategies, threshold functions, pre-processing techniques and evaluation metrics. The package was designed to allow users to easily perform complete multi-label classification experiments in the R environment.

Details

Currently, the main methods supported are:

  1. Classification methods: ML Baselines, Binary Relevance (BR), BR+, Classifier Chains, Calibrated Label Ranking (CLR), ConTRolled Label correlation exploitation (CTRL), Dependent Binary Relevance (DBR), Ensemble of Binary Relevance (EBR), Ensemble of Classifier Chains (ECC), Ensemble of Pruned Set (EPS), Hierarchy Of Multilabel classifiER (HOMER), Label specIfic FeaTures (LIFT), Label Powerset (LP), Meta-Binary Relevance (MBR or 2BR), Multi-label KNN (ML-KNN), Nested Stacking (NS), Pruned Problem Transformation (PPT), Pruned and Confident Stacking Approach (Prudent), Pruned Set (PS), Random k-labelsets (RAkEL), Recursive Dependent Binary Relevance (RDBR), Ranking by Pairwise Comparison (RPC)

  2. Evaluation methods: Performing a cross-validation procedure, Confusion Matrix, Evaluate, Supported measures

  3. Pre-process utilities: Fill sparse data, Normalize data, Remove attributes, Remove labels, Remove skewness labels, Remove unique attributes, Remove unlabeled instances, Replace nominal attributes

  4. Sampling methods: Create holdout partitions, Create k-fold partitions, Create random subset, Create subset, Partition fold

  5. Threshold methods: Fixed threshold, Cardinality threshold, MCUT, PCUT, RCUT, SCUT, Subset correction

However, there are other utilities methods not previously cited as as.bipartition, as.mlresult, as.ranking, multilabel_prediction, etc. More details and examples are available on utiml repository.

Notes

We use the mldr package, to manipulate multi-label data. See its documentation to more information about handle multi-label dataset.

Author(s)

This package is a result of my PhD at Institute of Mathematics and Computer Sciences (ICMC) at the University of Sao Paulo, Brazil.

PhD advisor: Andre C. P. L. F. de Carvalho


utiml documentation built on April 20, 2018, 1:04 a.m.