create_holdout_partition: Create a holdout partition based on the specified algorithm
In utiml: Utilities for Multi-Label Learning


This method creates multi-label datasets for training, testing, validation or other purposes, using the partition method defined in method. The number of partitions is defined by the partitions parameter. Each instance is used in only one partition of the division.

create_holdout_partition(
  mdata,
  partitions = c(train = 0.7, test = 0.3),
  method = c("random", "iterative", "stratified")
)

mdata

An mldr dataset.

partitions

A list of percentages or a single value. The sum of all values must not be greater than 1. If a single value is given, its complement is used to generate the second partition. If two or more values are given and their sum is lower than 1, the partitions are generated with the given proportions. If partitions has names, they are used to name the returned datasets. (Default: c(train=0.7, test=0.3)).
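The rules above can be sketched in base R. The helper resolve_partitions is hypothetical and written only for illustration; it is not utiml's internal code, and the small tolerance used in the sum check is an assumption.

```r
# Hypothetical helper mirroring the rules described for `partitions`.
# Not part of utiml; the 1e-8 tolerance is an assumption to absorb
# floating-point rounding when the proportions add up to 1.
resolve_partitions <- function(partitions) {
  if (sum(partitions) > 1 + 1e-8) {
    stop("The sum of all partitions must not be greater than 1")
  }
  # A single value is completed with its complement
  if (length(partitions) == 1) {
    partitions <- c(partitions, 1 - partitions)
  }
  partitions
}

single <- resolve_partitions(0.8)                         # 0.8 and its complement
named  <- resolve_partitions(c(train = 0.5, test = 0.3))  # sum below 1 is accepted
```

Names given in partitions are preserved, which matches the statement that they are used to name the returned datasets.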

method

The method to split the data. The default methods are:

random: Split the folds randomly.
iterative: Split the folds considering the proportion of each label individually. Some specific labels may not occur in all folds.
stratified: Split the folds considering the labelset proportions.

You can also create your own partition method. See the Note and Examples sections for more details. (Default: "random")
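To make the idea behind the stratified option concrete, here is a minimal base R sketch that splits instances so that each labelset keeps roughly the requested proportions in every partition. The function stratified_split_sketch and the toy label matrix are hypothetical and do not reproduce utiml's implementation.

```r
# Hypothetical sketch of a labelset-stratified split (not utiml's code).
# `labels` is a 0/1 matrix with one row per instance and one column per label.
stratified_split_sketch <- function(labels, r) {
  # One key per instance, e.g. "10" for y1 = 1, y2 = 0
  labelset <- apply(labels, 1, paste, collapse = "")
  folds <- rep(list(integer(0)), length(r))
  names(folds) <- names(r)
  for (idx in split(seq_len(nrow(labels)), labelset)) {
    # Shuffle the instances that share this labelset
    idx <- idx[sample.int(length(idx))]
    # Cut the shuffled group according to the cumulative proportions
    ends <- round(cumsum(r) * length(idx))
    starts <- c(0, ends[-length(ends)])
    for (k in seq_along(r)) {
      if (ends[k] > starts[k]) {
        folds[[k]] <- c(folds[[k]], idx[(starts[k] + 1):ends[k]])
      }
    }
  }
  lapply(folds, sort)
}

# Toy data: labelsets "10" (4 rows), "01" (4 rows) and "11" (2 rows)
labels <- cbind(y1 = c(1, 1, 1, 1, 0, 0, 0, 0, 1, 1),
                y2 = c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1))
set.seed(1)
folds <- stratified_split_sketch(labels, c(train = 0.5, test = 0.5))
```

With these toy proportions every labelset is divided evenly, so both partitions contain the same labelset counts. A purely random split gives no such guarantee.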

A list with at least two datasets sampled as specified in partitions parameter.

To create your own split method, you need to build a function that receives an mldr object and a list with the proportions of examples in each fold, and returns another list with the indexes of the elements for each fold.
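As an illustration of that contract, the sketch below defines a split function that shuffles the instance indexes and then cuts them by the requested proportions. The name random_split_sketch and the mock object are hypothetical; the mock is a plain list that only mimics the measures$num.instances field of an mldr object, so the example runs without utiml installed.

```r
# Hypothetical custom split method following the contract described above:
# it receives an mldr-like object and the proportions, and returns a list
# with the instance indexes of each fold.
random_split_sketch <- function(mdata, r) {
  n <- mdata$measures$num.instances
  shuffled <- sample.int(n)
  # Last position of each fold in the shuffled sequence
  ends <- round(cumsum(r) * n)
  starts <- c(0, ends[-length(ends)]) + 1
  lapply(seq_along(r), function(i) {
    if (ends[i] < starts[i]) integer(0) else shuffled[starts[i]:ends[i]]
  })
}

# Mock object standing in for an mldr dataset (assumption for illustration)
mock <- list(measures = list(num.instances = 10))
set.seed(42)
custom_folds <- random_split_sketch(mock, c(train = 0.7, test = 0.3))
```

With utiml loaded, such a function would be selected by passing its name as a string in the method argument, in the same way as the sequencial_split example in this page.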

Sechidis, K., Tsoumakas, G., & Vlahavas, I. (2011). On the stratification of multi-label data. In Proceedings of the Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD (pp. 145-158).

Other sampling: create_kfold_partition(), create_random_subset(), create_subset()

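library(utiml)

# Default split: 70% train and 30% test, using the "random" method
dataset <- create_holdout_partition(toyml)
names(dataset)
## [1] "train" "test"
#dataset$train
#dataset$test

# Four named partitions
dataset <- create_holdout_partition(toyml, c(a=0.1, b=0.2, c=0.3, d=0.4))
names(dataset)
## [1] "a" "b" "c" "d"

# Custom split method: assign instances to the partitions in their original order
sequencial_split <- function (mdata, r) {
  # Number of instances in each partition
  amount <- trunc(r * mdata$measures$num.instances)
  indexes <- c(0, cumsum(amount))
  # Make the last partition end at the final instance
  indexes[length(r)+1] <- mdata$measures$num.instances

  # Return one vector of consecutive indexes per partition
  lapply(seq(length(r)), function (i) {
    seq(indexes[i]+1, indexes[i+1])
  })
}
# The custom method is selected by passing the function name as a string
dataset <- create_holdout_partition(toyml, method="sequencial_split")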

Loading required package: mldr
[1] "train" "test"

utiml documentation built on May 31, 2021, 9:09 a.m.
