racog: Rapidly converging Gibbs algorithm.



Description

Treats imbalanced, discrete numeric datasets by generating synthetic minority examples drawn from an approximation of the minority class probability distribution.

Usage

racog(dataset, numInstances, burnin = 100, lag = 20, classAttr = "Class")

Arguments

dataset

data.frame to treat. All columns except the classAttr column must be numeric or coercible to numeric.

numInstances

Integer. Number of new minority examples to generate.

burnin

Integer. Number of initial examples generated from each minority example that are discarded before any are kept. Defaults to 100.

lag

Integer. Number of iterations between consecutive accepted examples in the chain started from a minority example. Defaults to 20.

classAttr

Character. Name of the class attribute in dataset. The column must exist.
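Because the attribute columns must be numeric (and, per the Details, discretized), continuous data needs some preparation first. The following is a minimal sketch using base R only; `discretize_features` and its `bins` argument are hypothetical names introduced for illustration and are not part of the imbalance package, and equal-width binning is just one possible choice.

```r
# Hypothetical helper, not part of the imbalance package.
# Replaces every non-class column by integer bin codes 1..bins.
discretize_features <- function(df, classAttr = "Class", bins = 5) {
  features <- setdiff(names(df), classAttr)
  for (col in features) {
    # cut() splits the column range into `bins` equal-width intervals;
    # as.integer() turns the resulting factor into integer codes.
    df[[col]] <- as.integer(cut(as.numeric(df[[col]]), breaks = bins))
  }
  df
}

# Toy data, made up for this sketch
toy <- data.frame(
  x = c(0.1, 0.5, 0.9, 2.3),
  y = c(10, 20, 30, 40),
  Class = c("a", "a", "b", "b")
)
prepared <- discretize_features(toy, classAttr = "Class", bins = 2)
```

The class column is left untouched, so `prepared` can be passed on with the same classAttr value.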

Details

Approximates the minority class distribution using a Gibbs sampler. The dataset must be discretized and numeric. Starting from each minority example, the algorithm runs a Markov chain that builds one new sample per iteration. The first burnin iterations are discarded; from then on, every lag-th sample is accepted as a new minority example. In total it generates d * (iterations - burnin) / lag examples, where d is the number of minority examples.
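The burnin and lag schedule described above can be illustrated with a little arithmetic. This is a sketch in base R, assuming each chain keeps exactly the iterations that fall a multiple of lag after the burn-in; `kept_iterations` and `iterations_needed` are hypothetical helpers, not functions of the imbalance package, and the package's internal bookkeeping may differ.

```r
# Hypothetical helpers, not part of the imbalance package.

# Iteration indices whose samples are kept by a single chain:
# nothing during burn-in, then one sample every `lag` iterations.
kept_iterations <- function(iterations, burnin = 100, lag = 20) {
  it <- seq_len(iterations)
  it[it > burnin & (it - burnin) %% lag == 0]
}

# Iterations per chain needed so that d chains together yield
# at least numInstances examples: d * (iterations - burnin) / lag.
iterations_needed <- function(numInstances, d, burnin = 100, lag = 20) {
  per_chain <- ceiling(numInstances / d)
  burnin + lag * per_chain
}

kept <- kept_iterations(200, burnin = 100, lag = 20)
needed <- iterations_needed(40, d = 10, burnin = 20, lag = 10)
```

With 200 iterations, burnin = 100 and lag = 20, a single chain keeps (200 - 100) / 20 = 5 samples, which matches the length of `kept`.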

Value

A data.frame with the same structure as dataset, containing the generated synthetic examples.
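Since the returned data.frame has the same structure as dataset, one plausible next step is to append it to the original data with rbind. The sketch below uses base R only; `original` and `synthetic` are made-up stand-ins (the latter takes the place of a real racog() result) so that the snippet runs without the package.

```r
# Toy imbalanced data: four "negative" rows and one "positive" row
original <- data.frame(
  x = c(1, 2, 2, 3, 1),
  Class = c("negative", "negative", "negative", "negative", "positive")
)

# Stand-in for the output of racog(): same columns as `original`
synthetic <- data.frame(
  x = c(1, 2, 1),
  Class = rep("positive", 3)
)

# Append the synthetic minority examples to the original data
balanced <- rbind(original, synthetic)
classCounts <- table(balanced$Class)
```

rbind matches data.frame columns by name, so this works as long as the generated data keeps the column names of the original.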

References

Das, Barnan; Krishnan, Narayanan C.; Cook, Diane J. Racog and Wracog: Two Probabilistic Oversampling Techniques. IEEE Transactions on Knowledge and Data Engineering 27 (2015), Nr. 1, p. 222–234.

Examples

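library(imbalance)
data(iris0)

# Generate 40 new minority examples with a shorter burn-in and lag
newSamples <- racog(iris0, numInstances = 40, burnin = 20, lag = 10,
                    classAttr = "Class")

# Generate 100 new minority examples with the default burnin and lag
newSamples <- racog(iris0, numInstances = 100)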

ncordon/imbalance documentation built on Feb. 19, 2018, 7:08 a.m.