View source: R/049_sym_opt_ln.R
| sym_opt_ln | R Documentation | 
Introduction of symmetric optimistic label noise into a classification dataset.
## Default S3 method:
sym_opt_ln(x, y, level, levelH = 0.9, order = levels(y), sortid = TRUE, ...)

## S3 method for class 'formula'
sym_opt_ln(formula, data, ...)
x | 
 a data frame of input attributes.  | 
y | 
 a factor vector with the output class of each sample.  | 
level | 
 a double in [0,1] with the noise level to be introduced.  | 
levelH | 
 a double in (0.5, 1] with the probability of mislabeling a noisy sample as a higher class (default: 0.9).  | 
order | 
 a character vector indicating the order of the classes (default: levels(y)).  | 
sortid | 
 a logical indicating if the indices must be sorted at the output (default: TRUE).  | 
... | 
 other options to pass to the function.  | 
formula | 
 a formula with the output class and, at least, one input attribute.  | 
data | 
 a data frame in which to interpret the variables in the formula.  | 
Symmetric optimistic label noise randomly selects (level·100)% of the samples
in the dataset, regardless of their class.
In the optimistic case, a sample of class i is more likely to be mislabeled as a class j
with j > i than as a class j with j < i.
Thus, when a sample is selected as noisy, its label is changed to a random higher class with probability levelH
and to a random lower class with probability 1-levelH. The order of the classes is determined by
order.
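The relabeling rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper relabel_optimistic is hypothetical, and the handling of the lowest and highest classes (falling back to the other side when the chosen side has no classes) is an assumption that the text above does not specify.

```r
# Hypothetical helper (not part of the package): relabels one noisy sample.
relabel_optimistic <- function(label, order, levelH = 0.9) {
  pos <- match(label, order)                 # position of the class in the ordering
  higher <- order[seq_along(order) > pos]    # classes ranked above the current one
  lower  <- order[seq_along(order) < pos]    # classes ranked below the current one
  # Choose the side: higher with probability levelH, lower with 1 - levelH.
  go_higher <- runif(1) < levelH
  pool <- if (go_higher) higher else lower
  # Assumption: if the chosen side is empty, fall back to the other side.
  if (length(pool) == 0) pool <- if (go_higher) lower else higher
  pool[sample.int(length(pool), 1)]          # uniform draw within the chosen side
}

set.seed(1)
order <- c("virginica", "setosa", "versicolor")
y <- factor(rep(order, each = 50), levels = order)
level <- 0.1

# Select (level * 100)% of the samples, regardless of their class.
idnoise <- sort(sample(length(y), round(level * length(y))))

# Relabel only the selected samples.
ynoise <- y
for (i in idnoise) {
  ynoise[i] <- relabel_optimistic(as.character(y[i]), order)
}

# Cross-tabulate original against noisy labels for the selected samples.
table(original = y[idnoise], noisy = ynoise[idnoise])
```

With levelH close to 1, most of the changed labels in the resulting table should move to classes that appear later in order. The sketch requires at least two classes in order.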
An object of class ndmodel with elements:
xnoise | 
 a data frame with the noisy input attributes.  | 
ynoise | 
 a factor vector with the noisy output class.  | 
numnoise | 
 an integer vector with the amount of noisy samples per class.  | 
idnoise | 
 an integer vector list with the indices of noisy samples.  | 
numclean | 
 an integer vector with the amount of clean samples per class.  | 
idclean | 
 an integer vector list with the indices of clean samples.  | 
distr | 
 an integer vector with the samples per class in the original data.  | 
model | 
 the full name of the noise introduction model used.  | 
param | 
 a list of the argument values.  | 
call | 
 the function call.  | 
Noise model adapted from the papers in References.
R. C. Prati, J. Luengo, and F. Herrera. Emerging topics and challenges of learning from noisy data in nonstandard classification: a survey beyond binary class noise. Knowledge and Information Systems, 60(1):63–97, 2019. doi: 10.1007/s10115-018-1244-4.
sym_usim_ln, sym_natd_ln, print.ndmodel, summary.ndmodel, plot.ndmodel
# load the dataset
data(iris2D)
# usage of the default method
set.seed(9)
outdef <- sym_opt_ln(x = iris2D[,-ncol(iris2D)], y = iris2D[,ncol(iris2D)], 
                     level = 0.1, order = c("virginica", "setosa", "versicolor"))
# show results
summary(outdef, showid = TRUE)
plot(outdef)
# usage of the method for class formula
set.seed(9)
outfrm <- sym_opt_ln(formula = Species ~ ., data = iris2D, 
                     level = 0.1, order = c("virginica", "setosa", "versicolor"))
# check the match of noisy indices
identical(outdef$idnoise, outfrm$idnoise)