# R/data.R

#' Toy multi-label dataset.
#'
#' \code{toyml} is a synthetic multi-label dataset generated with the tool
#' \url{http://sites.labic.icmc.usp.br/mldatagen/} using the Hyperspheres
#' strategy. It is intended for small tests and examples.
#'
#' @format An mldr object with 100 instances, 10 features and 5 labels:
#' \describe{
#'   \item{att1}{Relevant numeric attribute between -1 and 1}
#'   \item{att2}{Relevant numeric attribute between -1 and 1}
#'   \item{att3}{Relevant numeric attribute between -1 and 1}
#'   \item{att4}{Relevant numeric attribute between -1 and 1}
#'   \item{att5}{Relevant numeric attribute between -1 and 1}
#'   \item{att6}{Relevant numeric attribute between -1 and 1}
#'   \item{att7}{Relevant numeric attribute between -1 and 1}
#'   \item{iatt8}{Irrelevant numeric attribute between -1 and 1}
#'   \item{iatt9}{Irrelevant numeric attribute between -1 and 1}
#'   \item{ratt10}{Redundant numeric attribute between -1 and 1}
#'   \item{y1}{Label 'y1' - Frequency: 0.17}
#'   \item{y2}{Label 'y2' - Frequency: 0.78}
#'   \item{y3}{Label 'y3' - Frequency: 0.19}
#'   \item{y4}{Label 'y4' - Frequency: 0.69}
#'   \item{y5}{Label 'y5' - Frequency: 0.17}
#' }
#'
#' @details General Information
#' \itemize{
#'  \item Cardinality: 2
#'  \item Density: 0.4
#'  \item Distinct multi-labels: 18
#'  \item Number of single labelsets: 5
#'  \item Max frequency: 23
#' }
#'
#' @source Generated by \url{http://sites.labic.icmc.usp.br/mldatagen/}
#' with the following configuration:
#' \itemize{
#'   \item Strategy: Hyperspheres
#'   \item Relevant Features: 7
#'   \item Irrelevant Features: 2
#'   \item Redundant Features: 1
#'   \item Number of Labels (q): 5
#'   \item Number of Instances: 100
#'   \item Noise (from 0 to 1): 0.05
#'   \item Maximum Radius/Half-Edge of the Hyperspheres/Hypercubes: 0.8
#'   \item Minimum Radius/Half-Edge of the Hyperspheres/Hypercubes: ((q/10)+1)/q
#' }
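#'
#' @examples
#' # A minimal sketch, not part of the original documentation: it re-derives
#' # the summary measures from the values documented above. The variable
#' # names (freq, cardinality, density, q, min_radius) are illustrative only.
#' freq <- c(y1 = 0.17, y2 = 0.78, y3 = 0.19, y4 = 0.69, y5 = 0.17)
#'
#' # Cardinality is the mean number of labels per instance, which equals
#' # the sum of the relative label frequencies.
#' cardinality <- sum(freq)
#'
#' # Density is the cardinality divided by the number of labels.
#' density <- cardinality / length(freq)
#'
#' # Minimum radius from the configuration formula ((q/10)+1)/q.
#' q <- length(freq)
#' min_radius <- ((q / 10) + 1) / q
#'
#' round(c(cardinality = cardinality, density = density,
#'         min_radius = min_radius), 2)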
"toyml"

#' Foodtruck multi-label dataset.
#'
#' \code{foodtruck} is a real-world multi-label dataset that uses habits and
#' personal information to predict food truck cuisines.
#'
#' @format An mldr object with 407 instances, 21 features and 12 labels.
#'
#' @details General Information
#' \itemize{
#'  \item Cardinality: 2.28
#'  \item Density: 0.19
#'  \item Distinct multi-labels: 117
#'  \item Number of single labelsets: 74
#'  \item Max frequency: 114
#' }
#'
#' @source The dataset is described in:
#' Rivolli A., Parker L.C., de Carvalho A.C.P.L.F. (2017) Food Truck
#' Recommendation Using Multi-label Classification. In: Oliveira E., Gama J.,
#' Vale Z., Lopes Cardoso H. (eds) Progress in Artificial Intelligence. EPIA
#' 2017. Lecture Notes in Computer Science, vol 10423. Springer, Cham
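#'
#' @examples
#' # A minimal sketch, not part of the original documentation: it checks that
#' # the documented density follows from the documented cardinality and the
#' # number of labels. The variable names are illustrative only.
#' ft_cardinality <- 2.28
#' ft_labels <- 12
#' ft_density <- ft_cardinality / ft_labels
#' round(ft_density, 2)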
"foodtruck"
# rivolli/utiml documentation built on June 1, 2021, 11:48 p.m.