wracog: Wrapper for rapidly converging Gibbs algorithm.
In ncordon/imbalance: Preprocessing Algorithms for Imbalanced Datasets


Generates synthetic minority examples by approximating their probability distribution until the sensitivity of the wrapper on the validation set can no longer be improved. Works only on discrete numeric datasets.

wracog(
  train,
  validation,
  wrapper,
  slideWin = 10,
  threshold = 0.02,
  classAttr = "Class",
  ...
)

`train`	`data.frame`. An initial dataset used to generate the first model. All columns, except the `classAttr` one, have to be numeric or coercible to numeric.
`validation`	`data.frame`. A dataset used to compare the results of consecutive classifiers. Must have the same structure as `train`.
`wrapper`	An `S3` object. There must exist a `trainWrapper` method implemented for the class of the object, and a `predict` method implemented for the class of the model returned by `trainWrapper`. Alternatively, it can be the name of one of the wrappers distributed with the package, `"KNN"` or `"C5.0"`.
`slideWin`	Number of most recent sensitivities taken into account by the stopping criterion. By default, 10.
`threshold`	Threshold that the mean of the last `slideWin` sensitivities should reach. By default, 0.02.
`classAttr`	`character`. Name of the class attribute in `train` and `validation`. Must exist in both.
`...`	Further arguments for `wrapper`.
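The `wrapper` contract described above (a `trainWrapper` method for the wrapper's class plus a `predict` method for the returned model's class) can be illustrated with a minimal base-R sketch. Everything here is hypothetical: the class names `CentroidWrapper` and `CentroidModel`, the nearest-centroid classifier, and the toy data are made up for illustration. The local `trainWrapper` generic is only a stand-in so the sketch runs without the package, and the sketch assumes that `predict` is expected to return one class label per row.

```r
# Stand-in for the package's generic, defined here only so the sketch runs
# on its own (assumption); omit this line when the imbalance package is loaded.
trainWrapper <- function(wrapper, train, trainClass, ...) {
  UseMethod("trainWrapper")
}

# Hypothetical wrapper object; the class name is made up for this sketch.
centroidWrapper <- structure(list(), class = "CentroidWrapper")

# trainWrapper method: store one mean vector (centroid) per class.
trainWrapper.CentroidWrapper <- function(wrapper, train, trainClass, ...) {
  trainClass <- factor(trainClass)
  perClass <- split(as.data.frame(train), trainClass)
  centroids <- do.call(rbind, lapply(perClass, colMeans))
  structure(list(centroids = centroids), class = "CentroidModel")
}

# predict method for the model class: label each row with its nearest centroid.
predict.CentroidModel <- function(object, newdata, ...) {
  x <- as.matrix(newdata)
  # Squared Euclidean distance from every row to every centroid
  dists <- sapply(seq_len(nrow(object$centroids)), function(k) {
    rowSums(sweep(x, 2, object$centroids[k, ])^2)
  })
  dists <- matrix(dists, nrow = nrow(x))
  classes <- rownames(object$centroids)
  factor(classes[max.col(-dists, ties.method = "first")], levels = classes)
}

# Toy check on made-up data
toy <- data.frame(a = c(1, 1, 2, 8, 9, 9), b = c(1, 2, 1, 9, 8, 9))
toyClass <- c("neg", "neg", "neg", "pos", "pos", "pos")
model <- trainWrapper(centroidWrapper, toy, toyClass)
pred <- predict(model, data.frame(a = c(1, 9), b = c(2, 8)))
```

An object such as `centroidWrapper` would then be passed as the `wrapper` argument, in the same way as `myWrapper` in the examples.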

The algorithm keeps generating samples with a Gibbs sampler and adds to the train dataset those samples that are misclassified by the model built on the previous train dataset. It stops once the last slideWin executions of wrapper over the validation dataset reach a mean sensitivity lower than threshold. The initial model is built on the initial train dataset.
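To make the quantities in this description concrete, the following sketch computes sensitivity (true positives divided by all actual positives of the minority class) and checks a sliding-window condition. The helper names `sensitivity` and `shouldStop` are hypothetical, and `shouldStop` is only a literal reading of the sentence above; it is not taken from the package source, which may implement the criterion differently.

```r
# Sensitivity (recall) of the minority class: TP / (TP + FN)
sensitivity <- function(predicted, actual, minorityClass) {
  isMinority <- actual == minorityClass
  sum(predicted[isMinority] == minorityClass) / sum(isMinority)
}

# Hypothetical helper: literal reading of the documented stopping rule.
# Returns TRUE once at least `slideWin` sensitivities are available and the
# mean of the last `slideWin` of them is lower than `threshold`.
shouldStop <- function(sensitivities, slideWin = 10, threshold = 0.02) {
  if (length(sensitivities) < slideWin) {
    return(FALSE)
  }
  mean(utils::tail(sensitivities, slideWin)) < threshold
}

# Made-up predictions on a made-up validation set
actual    <- c("pos", "pos", "neg", "neg")
predicted <- c("pos", "neg", "pos", "neg")
s <- sensitivity(predicted, actual, "pos")
```

In a loop of this kind, one sensitivity value would be appended per iteration and `shouldStop` evaluated on the growing vector.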

A data.frame with the same structure as train, containing the generated synthetic examples.
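Because the returned data.frame has the same structure as train, the synthetic examples can be appended to the original training data with `rbind`. The sketch below uses made-up stand-ins for both data frames rather than a real call to `wracog`, so the column names and values are assumptions for illustration only.

```r
# Toy stand-in for a training partition (values are made up)
trainSet <- data.frame(
  Age = c(30, 34, 41),
  Class = factor(c("negative", "negative", "positive"),
                 levels = c("negative", "positive"))
)

# Toy stand-in for what wracog() would return: same columns as trainSet
newSamples <- data.frame(
  Age = c(40, 42),
  Class = factor(c("positive", "positive"),
                 levels = c("negative", "positive"))
)

# Append the synthetic minority examples to the training data
augmented <- rbind(trainSet, newSamples)
classCounts <- table(augmented$Class)
```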

Das, Barnan; Krishnan, Narayanan C.; Cook, Diane J. Racog and Wracog: Two Probabilistic Oversampling Techniques. IEEE Transactions on Knowledge and Data Engineering 27(2015), Nr. 1, p. 222–234.

library(imbalance)
data(haberman)

# Create train and validation partitions of haberman
trainFold <- sample(1:nrow(haberman), nrow(haberman)/2, FALSE)
trainSet <- haberman[trainFold, ]
validationSet <- haberman[-trainFold, ]

# Define our own wrapper with a C5.0 tree.
# The predict method for the returned model is provided by the C50 package.
myWrapper <- structure(list(), class = "TestWrapper")
trainWrapper.TestWrapper <- function(wrapper, train, trainClass) {
  C50::C5.0(train, trainClass)
}

# Execute wRACOG with our own wrapper
newSamples <- wracog(trainSet, validationSet, myWrapper,
                     classAttr = "Class")


# Execute wRACOG with the predefined wrappers "KNN" and "C5.0"
KNNSamples <- wracog(trainSet, validationSet, "KNN")
C50Samples <- wracog(trainSet, validationSet, "C5.0")