ebr: Ensemble of Binary Relevance for multi-label Classification


View source: R/method_ebr.R

Description

Create an Ensemble of Binary Relevance model for multilabel classification.

Usage

ebr(
  mdata,
  base.algorithm = getOption("utiml.base.algorithm", "SVM"),
  m = 10,
  subsample = 0.75,
  attr.space = 0.5,
  replacement = TRUE,
  ...,
  cores = getOption("utiml.cores", 1),
  seed = getOption("utiml.seed", NA)
)

Arguments

mdata

A mldr dataset used to train the binary models.

base.algorithm

A string with the name of the base algorithm. (Default: getOption("utiml.base.algorithm", "SVM"))

m

The number of Binary Relevance models used in the ensemble. (Default: 10)

subsample

A value between 0.1 and 1 giving the fraction of training instances used to train each classifier. (Default: 0.75)

attr.space

A value between 0.1 and 1 giving the fraction of attributes used to train each classifier. (Default: 0.50)

replacement

Boolean value indicating whether to sample with replacement when creating the training data for each model in the ensemble. (Default: TRUE)

...

Other arguments passed to the base algorithm for all subproblems.

cores

The number of cores used to parallelize the training. Values higher than 1 require the parallel package. (Default: getOption("utiml.cores", 1))

seed

An optional integer used to set the seed. This is useful when the method is run in parallel. (Default: getOption("utiml.seed", NA))
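The subsample, attr.space and replacement arguments can be pictured as controlling which rows and columns each ensemble member sees. The following is a minimal base R sketch of that idea, not utiml's actual code: sample_member is a hypothetical helper, and both the rounding with ceiling() and drawing attributes without replacement are assumptions made for illustration.

```r
# Hypothetical helper (not part of utiml): draws the row and column
# indices that one ensemble member would be trained on.
sample_member <- function(n_rows, n_attrs, subsample = 0.75,
                          attr.space = 0.5, replacement = TRUE) {
  n_r <- ceiling(n_rows * subsample)    # assumed rounding rule
  n_c <- ceiling(n_attrs * attr.space)  # assumed rounding rule
  list(
    # Instances may repeat when replacement = TRUE
    rows = sample.int(n_rows, n_r, replace = replacement),
    # Assumption: attributes are drawn without replacement
    cols = sort(sample.int(n_attrs, n_c))
  )
}

set.seed(312)  # fixing the seed makes the draw repeatable
idx <- sample_member(n_rows = 100, n_attrs = 10)
length(idx$rows)  # 75 instances
length(idx$cols)  # 5 attributes
```

Calling the helper once per ensemble member gives every member a different view of the data, which is what makes the combined prediction more robust than a single model.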

Details

This model is composed of a set of Binary Relevance models. Binary Relevance is a simple and effective transformation method that predicts multi-label data by training one binary classifier per label.
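As a rough illustration of the idea, the sketch below builds a small ensemble of Binary Relevance models in base R. It is a sketch under stated assumptions rather than the package's implementation: fit_br, fit_ebr and predict_ebr are hypothetical names, logistic regression (glm) stands in for the base algorithm, and the members are combined by a simple majority vote.

```r
# Toy data: 3 numeric attributes and 2 binary labels (illustrative only).
set.seed(1)
X <- data.frame(a = rnorm(60), b = rnorm(60), c = rnorm(60))
Y <- data.frame(y1 = as.integer(X$a + rnorm(60) > 0),
                y2 = as.integer(X$b + X$c + rnorm(60) > 0))

# Binary Relevance: one independent binary classifier per label.
fit_br <- function(X, Y) {
  lapply(Y, function(y) {
    d <- cbind(X, y = y)
    # Warnings (e.g. separable subsamples) are silenced for this sketch
    suppressWarnings(glm(y ~ ., data = d, family = binomial))
  })
}

# Ensemble: each member is a BR model trained on a random subset of
# instances and attributes (hypothetical helper, assumed sampling rule).
fit_ebr <- function(X, Y, m = 10, subsample = 0.75, attr.space = 0.5) {
  lapply(seq_len(m), function(i) {
    rows <- sample.int(nrow(X), ceiling(nrow(X) * subsample), replace = TRUE)
    cols <- sample.int(ncol(X), ceiling(ncol(X) * attr.space))
    fit_br(X[rows, cols, drop = FALSE], Y[rows, , drop = FALSE])
  })
}

# Prediction: average the 0/1 votes of all members per label and
# predict a label when at least half of the members vote for it.
predict_ebr <- function(models, newdata) {
  votes <- lapply(models, function(br) {
    sapply(br, function(fit) {
      as.integer(predict(fit, newdata = newdata, type = "response") > 0.5)
    })
  })
  avg <- Reduce(`+`, votes) / length(models)
  (avg >= 0.5) * 1L
}

models <- fit_ebr(X, Y, m = 5)
pred <- predict_ebr(models, X)
dim(pred)  # one row per instance, one column per label
```

Because each member only sees part of the instances and attributes, individual members are weaker than a single BR model trained on all data, and the vote is what recovers accuracy.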

Value

An object of class EBRmodel containing the set of fitted BR models, including:

models

A list of BR models.

nrow

The number of instances used in each training dataset.

ncol

The number of attributes used in each training dataset.

rounds

The number of iterations.

Note

To reproduce the same classification and obtain the same results, you must set the flag utiml.mc.set.seed to FALSE.
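The defaults in Usage are read with getOption(), so they can be changed for a whole session with base R's options(). The sketch below assumes that the utiml.mc.set.seed flag is read through the same options mechanism as the other utiml.* settings; it uses only base R functions and does not call utiml itself.

```r
# Fallback value is returned while the option is unset
cores_default <- getOption("utiml.cores", 1)

# Set session-wide defaults; options() returns the previous values
old <- options(
  utiml.cores = 2,
  utiml.seed = 312,
  utiml.mc.set.seed = FALSE  # assumption: the flag is a regular R option
)

cores_now <- getOption("utiml.cores", 1)  # 2
flag_now <- getOption("utiml.mc.set.seed")

# Restore the previous state when done
options(old)
```

Saving the return value of options() and restoring it afterwards keeps the change from leaking into unrelated code in the same session.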

References

Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2011). Classifier chains for multi-label classification. Machine Learning, 85(3), 333-359.

Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2009). Classifier Chains for Multi-label Classification. Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, 5782, 254-269.

See Also

Other Transformation methods: brplus(), br(), cc(), clr(), dbr(), ecc(), eps(), esl(), homer(), lift(), lp(), mbr(), ns(), ppt(), prudent(), ps(), rakel(), rdbr(), rpc()

Other Ensemble methods: ecc(), eps()

Examples

library(utiml)

# Train with the "RANDOM" base algorithm and predict on the training data
model <- ebr(toyml, "RANDOM")
pred <- predict(model, toyml)


# Use C5.0 with 90% of instances and only 5 rounds
model <- ebr(toyml, 'C5.0', m = 5, subsample = 0.9)

# Use 75% of attributes
model <- ebr(toyml, attr.space = 0.75)

# Run on 2 cores with a specific seed
model1 <- ebr(toyml, cores = 2, seed = 312)

utiml documentation built on May 31, 2021, 9:09 a.m.