View source: R/029_qua_uni_ln.R
qua_uni_ln | R Documentation |
Introduction of Quadrant-based uniform label noise into a classification dataset.
## Default S3 method:
qua_uni_ln(x, y, level, att1 = 1, att2 = 2, sortid = TRUE, ...)

## S3 method for class 'formula'
qua_uni_ln(formula, data, ...)
x |
a data frame of input attributes. |
y |
a factor vector with the output class of each sample. |
level |
a double vector with the noise levels in [0,1] in each quadrant. |
att1 |
an integer with the index of the first attribute forming the quadrants (default: 1). |
att2 |
an integer with the index of the second attribute forming the quadrants (default: 2). |
sortid |
a logical indicating if the indices must be sorted at the output (default: TRUE). |
... |
other options to pass to the function. |
formula |
a formula with the output class and, at least, one input attribute. |
data |
a data frame in which to interpret the variables in the formula. |
For each sample, the probability of flipping its label depends on which quadrant (with respect to the attributes att1 and att2) the sample falls in. The probability of mislabeling in each quadrant is given by the argument level, a vector of length 4.
Let m1 and m2 be the mean values of the domain of att1 and att2, respectively. The quadrants are defined as follows:
values <= m1 and <= m2 (first quadrant);
values <= m1 and > m2 (second quadrant);
values > m1 and <= m2 (third quadrant);
values > m1 and > m2 (fourth quadrant).
Finally, the labels of the samples selected for flipping are randomly replaced by different ones from the set of class labels.
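The procedure described above can be sketched in base R as follows. This is an illustrative sketch only, not the package's implementation: the helpers quadrant_of and flip_by_quadrant are hypothetical names, m1 and m2 are assumed to be the arithmetic means of the observed values (the midpoint of each attribute's range is another possible reading of "mean values of the domain"), and each sample is assumed to be flipped by an independent random draw.

```r
# Hypothetical helper (not part of the package): quadrant index 1..4 per sample.
quadrant_of <- function(a1, a2) {
  m1 <- mean(a1)  # assumption: arithmetic mean of the observed values
  m2 <- mean(a2)
  # <= m1 & <= m2 -> 1; <= m1 & > m2 -> 2; > m1 & <= m2 -> 3; > m1 & > m2 -> 4
  1L + 2L * (a1 > m1) + 1L * (a2 > m2)
}

# Hypothetical helper: flip each label with the probability of its quadrant.
flip_by_quadrant <- function(x, y, level, att1 = 1, att2 = 2) {
  stopifnot(is.factor(y), nlevels(y) >= 2,
            length(level) == 4, all(level >= 0 & level <= 1))
  q <- quadrant_of(x[[att1]], x[[att2]])
  # assumption: one independent draw per sample against its quadrant's level
  flip <- runif(length(y)) < level[q]
  ynoise <- y
  for (i in which(flip)) {
    # replace the label by a different one chosen uniformly at random
    others <- setdiff(levels(y), as.character(y[i]))
    ynoise[i] <- sample(others, 1)
  }
  list(quadrant = q, ynoise = ynoise, idnoise = which(flip))
}

# usage on the built-in iris data (first two attributes form the quadrants)
set.seed(9)
res <- flip_by_quadrant(iris[, 1:4], iris$Species,
                        level = c(0.05, 0.15, 0.20, 0.4))
table(res$quadrant)     # samples per quadrant
length(res$idnoise)     # number of flipped labels
```

Because the sketch draws independently per sample, the number of noisy samples in a quadrant varies around level times the quadrant size; the function documented here may select the noisy samples differently.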
An object of class ndmodel with elements:
xnoise |
a data frame with the noisy input attributes. |
ynoise |
a factor vector with the noisy output class. |
numnoise |
an integer vector with the amount of noisy samples per class. |
idnoise |
an integer vector list with the indices of noisy samples. |
numclean |
an integer vector with the amount of clean samples per class. |
idclean |
an integer vector list with the indices of clean samples. |
distr |
an integer vector with the samples per class in the original data. |
model |
the full name of the noise introduction model used. |
param |
a list of the argument values. |
call |
the function call. |
Noise model adapted from the papers in References.
A. Ghosh, N. Manwani, and P. S. Sastry. Making risk minimization tolerant to label noise. Neurocomputing, 160:93-107, 2015. doi: 10.1016/j.neucom.2014.09.081.
exps_cuni_ln, print.ndmodel, summary.ndmodel, plot.ndmodel
# load the dataset
data(iris2D)

# usage of the default method
set.seed(9)
outdef <- qua_uni_ln(x = iris2D[,-ncol(iris2D)], y = iris2D[,ncol(iris2D)],
                     level = c(0.05, 0.15, 0.20, 0.4))

# show results
summary(outdef, showid = TRUE)
plot(outdef)

# usage of the method for class formula
set.seed(9)
outfrm <- qua_uni_ln(formula = Species ~ ., data = iris2D,
                     level = c(0.05, 0.15, 0.20, 0.4))

# check the match of noisy indices
identical(outdef$idnoise, outfrm$idnoise)