# boud_gau_an: Boundary/dependent Gaussian attribute noise In noisemodel: Noise Models for Classification Datasets

## Boundary/dependent Gaussian attribute noise

### Description

Introduction of Boundary/dependent Gaussian attribute noise into a classification dataset.

### Usage

```r
## Default S3 method:
boud_gau_an(x, y, level, k = 0.2, sortid = TRUE, ...)

## S3 method for class 'formula'
boud_gau_an(formula, data, ...)
```

### Arguments

| Argument | Description |
|---|---|
| `x` | a data frame of input attributes. |
| `y` | a factor vector with the output class of each sample. |
| `level` | a double in [0,1] with the noise level to be introduced. |
| `k` | a double in [0,1] with the scale used for the standard deviation (default: 0.2). |
| `sortid` | a logical indicating if the indices must be sorted at the output (default: `TRUE`). |
| `...` | other options to pass to the function. |
| `formula` | a formula with the output class and, at least, one input attribute. |
| `data` | a data frame in which to interpret the variables in the formula. |

### Details

Boundary/dependent Gaussian attribute noise corrupts (`level`·100)% of the samples, chosen among the ((`level`+0.1)·100)% of samples closest to the decision boundary. Their attribute values are corrupted by adding a random number drawn from a Gaussian distribution with mean 0 and standard deviation (max-min)·`k`, where max and min are the limits of the attribute domain. For nominal attributes, a random value is chosen.
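The two numeric rules in the Details text (how many samples are corrupted, and how large the Gaussian perturbation is) can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name `add_gaussian_noise`, the use of the observed range as the attribute domain, the rounding of sample counts, and the way the candidate indices are obtained are all assumptions. The boundary-distance computation that selects the candidates is not reproduced here.

```r
# Hypothetical helper (not part of noisemodel): add zero-mean Gaussian noise
# to the selected entries of one numeric attribute.
add_gaussian_noise <- function(values, idx, k = 0.2) {
  # Assumption: the attribute domain is taken as the observed range.
  rng <- range(values)
  sdev <- (rng[2] - rng[1]) * k  # standard deviation = (max - min) * k
  values[idx] <- values[idx] + rnorm(length(idx), mean = 0, sd = sdev)
  values
}

# Sample counts implied by the Details text (rounding is an assumption).
n <- 150
level <- 0.1
n_candidates <- round((level + 0.1) * n)  # samples closest to the boundary
n_noisy <- round(level * n)               # samples actually corrupted

set.seed(9)
x <- iris$Petal.Length

# Assumption: the candidates are already known; as a placeholder, the first
# n_candidates rows stand in for the samples closest to the boundary.
candidates <- seq_len(n_candidates)
idx <- sort(sample(candidates, n_noisy))

x_noisy <- add_gaussian_noise(x, idx, k = 0.2)

# Only the selected entries differ from the original attribute.
which(x_noisy != x)
```

With `level = 0.1` and 150 samples, this draws 15 samples out of 30 candidates. A larger `k` widens the perturbation in proportion to the attribute range.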

### Value

An object of class `ndmodel` with elements:

| Element | Description |
|---|---|
| `xnoise` | a data frame with the noisy input attributes. |
| `ynoise` | a factor vector with the noisy output class. |
| `numnoise` | an integer vector with the amount of noisy samples per attribute. |
| `idnoise` | an integer vector list with the indices of noisy samples per attribute. |
| `numclean` | an integer vector with the amount of clean samples per attribute. |
| `idclean` | an integer vector list with the indices of clean samples per attribute. |
| `distr` | an integer vector with the samples per class in the original data. |
| `model` | the full name of the noise introduction model used. |
| `param` | a list of the argument values. |
| `call` | the function call. |
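To make the per-attribute layout of `idnoise` and `idclean` concrete, the sketch below summarises a result-like list. The object `res` is a hand-built stand-in that only reuses the element names listed above; it is not real output of `boud_gau_an`, and its attribute names and indices are illustrative. The helper `noise_fraction` is hypothetical and assumes that each list entry holds the sample indices for one attribute.

```r
# Hand-built stand-in with two attributes and 10 samples
# (illustrative values only, not produced by boud_gau_an).
noisy_ids <- c(2L, 5L)
res <- list(
  idnoise = list(attr1 = noisy_ids, attr2 = noisy_ids),
  idclean = list(attr1 = setdiff(1:10, noisy_ids),
                 attr2 = setdiff(1:10, noisy_ids))
)

# Hypothetical helper: fraction of noisy samples per attribute,
# computed from the lengths of the index vectors.
noise_fraction <- function(res) {
  n_noisy <- lengths(res$idnoise)
  n_clean <- lengths(res$idclean)
  n_noisy / (n_noisy + n_clean)
}

noise_fraction(res)
```

The same helper could be applied to an actual result, provided its `idnoise` and `idclean` elements follow the layout assumed here.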

### Note

Noise model adapted from the papers in References.

### References

J. Bi and T. Zhang. Support vector classification with input data uncertainty. In Advances in Neural Information Processing Systems, volume 17, pages 161-168, 2004. URL: https://proceedings.neurips.cc/paper/2004/hash/22b1f2e0983160db6f7bb9f62f4dbb39-Abstract.html

### See Also

`imp_int_an`, `asy_int_an`, `print.ndmodel`, `summary.ndmodel`, `plot.ndmodel`

### Examples

```r
# load the dataset
data(iris2D)

# usage of the default method
set.seed(9)
outdef <- boud_gau_an(x = iris2D[,-ncol(iris2D)], y = iris2D[,ncol(iris2D)], level = 0.1)

# show results
summary(outdef, showid = TRUE)
plot(outdef)

# usage of the method for class formula
set.seed(9)
outfrm <- boud_gau_an(formula = Species ~ ., data = iris2D, level = 0.1)

# check the match of noisy indices
identical(outdef$idnoise, outfrm$idnoise)

```

noisemodel documentation built on Oct. 17, 2022, 9:05 a.m.