View source: R/data_balancing_funcs.R
Creates possibly balanced samples by randomly over-sampling minority examples, under-sampling majority examples, or combining over- and under-sampling.
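To make the two resampling ideas concrete, here is a minimal base-R sketch on a hypothetical toy data set. It is not the ROSE implementation: the names `toy`, `over` and `under` are made up for illustration, and the class sizes are forced to match exactly, which may differ from how the package draws its samples.

```r
# Minimal sketch of random over- and under-sampling in base R.
# Assumption: a toy data set with 90 majority (0) and 10 minority (1) rows.
set.seed(1)
toy <- data.frame(
  x   = rnorm(100),
  cls = factor(rep(c(0, 1), times = c(90, 10)))
)

idx_maj <- which(toy$cls == 0)
idx_min <- which(toy$cls == 1)

# Over-sampling: keep every majority row and draw minority rows
# with replacement until both classes have the same size.
over <- toy[c(idx_maj, sample(idx_min, length(idx_maj), replace = TRUE)), ]

# Under-sampling: keep every minority row and draw majority rows
# without replacement down to the minority class size.
under <- toy[c(sample(idx_maj, length(idx_min)), idx_min), ]

table(over$cls)
table(under$cls)
```

Over-sampling keeps all of the original information but repeats minority rows, while under-sampling discards majority rows; combining the two trades off between these effects.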
formula |
An object of class |
data |
An optional data frame, list or environment (or object
coercible to a data frame by |
method |
One among |
N |
The desired sample size of the resulting data set.
If missing and |
p |
The probability of resampling from the rare class.
If missing and |
subset |
An optional vector specifying a subset of observations to be used in the sampling process.
The default is set by the |
na.action |
A function which indicates what should happen when the data contain 'NA's.
The default is set by the |
seed |
A single value, interpreted as an integer. Specifying a seed is recommended so that the sample can be reproduced. |
The value is an object of class ovun.sample, which has the following components:
Call |
The matched call. |
method |
The method used to balance the sample. Possible choices are |
data |
The resulting new data set. |
ROSE.
# 2-dimensional example
library(ROSE)
# loading data
data(hacide)
# imbalance on training set
table(hacide.train$cls)
# balanced data set with both over and under sampling
data.balanced.ou <- ovun.sample(cls ~ ., data = hacide.train,
                                N = nrow(hacide.train), p = 0.5,
                                seed = 1, method = "both")$data
table(data.balanced.ou$cls)
# balanced data set with over-sampling
data.balanced.over <- ovun.sample(cls ~ ., data = hacide.train,
                                  p = 0.5, seed = 1,
                                  method = "over")$data
table(data.balanced.over$cls)
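
The examples above extract only the `data` component. As a complementary sketch, the snippet below keeps the whole returned object and looks at the components listed in the Value section, this time with under-sampling. It assumes the ROSE package is installed; the name `res.under` is chosen here for illustration.

```r
library(ROSE)
data(hacide)

# Keep the full ovun.sample object instead of extracting $data directly.
res.under <- ovun.sample(cls ~ ., data = hacide.train,
                         p = 0.5, seed = 1,
                         method = "under")

class(res.under)            # class of the returned object
res.under$method            # balancing method that was used
table(res.under$data$cls)   # class distribution of the resampled data
```

Because under-sampling discards majority examples, the resulting data set is smaller than `hacide.train`.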