DatRel | R Documentation |
DatRel
Relocates resampled data using Pure and Proper Class Cover Catch Digraphs (PCCCD).
DatRel(x, y, x_syn, proportion = 1, p_of = 0, class_pos = NULL)
x |
feature matrix or dataframe. |
y |
class factor variable. |
x_syn |
synthetic data generated by an oversampling method. |
proportion |
proportion of covered samples. A real number between |
p_of |
proportion to increase cover radius. A real number between |
class_pos |
Class name of the synthetic data. Default is NULL, in which case the minority class is taken as the positive class. |
Calculates cover areas using pure and proper class cover catch digraphs (PCCCD) for
the original dataset. Any sample outside of the cover area is relocated towards a
specific dominant point. The dominant point to move towards is chosen using a
distance based on the radii of the PCCCD balls. The p_of argument increases the
obtained radii so that they are more tolerant to noise. The proportion argument is
the cover percentage: PCCCD stops once the desired percentage of each class is
covered. PCCCD models are fitted using the rcccd package. The class_pos argument
specifies the oversampled class.
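The relocation step described above can be illustrated with a small base R sketch. This is not the code used by DatRel. The helper relocate_sketch is hypothetical, and it rests on three assumptions that the text above does not confirm: p_of inflates each radius as radius * (1 + p_of), the target dominant point is the one with the smallest distance divided by radius, and an uncovered point is moved onto the boundary of that ball.

```r
# Hypothetical helper, not part of SMOTEWB or rcccd.
# x_syn:   matrix of synthetic points (one row per point)
# centers: matrix of dominant points (one row per ball)
# radii:   numeric vector of ball radii, one per dominant point
relocate_sketch <- function(x_syn, centers, radii, p_of = 0) {
  # Assumption: p_of inflates every radius proportionally
  radii <- radii * (1 + p_of)
  for (i in seq_len(nrow(x_syn))) {
    # center - point, one row per ball
    diffs <- sweep(centers, 2, x_syn[i, ])
    # Euclidean distance from the point to each center
    d <- sqrt(rowSums(diffs^2))
    # The point is already inside some ball: leave it where it is
    if (any(d <= radii)) next
    # Assumption: pick the ball that is closest relative to its radius
    j <- which.min(d / radii)
    # Move the point along the line to center j until it sits on the boundary
    x_syn[i, ] <- centers[j, ] - diffs[j, ] / d[j] * radii[j]
  }
  x_syn
}

# Two balls and three synthetic points; only the first point is covered
centers <- rbind(c(0, 0), c(10, 0))
radii <- c(1, 2)
x_toy <- rbind(c(0.5, 0), c(3, 0), c(10, 6))
relocated <- relocate_sketch(x_toy, centers, radii)
```

In this toy setup the covered point stays in place, while the other two are pulled onto the boundary of the ball that is nearest in radius-scaled distance. A larger p_of widens the balls, so fewer points are moved and those that are moved travel a shorter distance.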
a list object which includes:
x_new |
Oversampled and relocated feature matrix |
y_new |
Oversampled class variable |
x_syn |
Generated and relocated sample matrix |
i_dominant |
Indexes of dominant samples |
x_pos_dominant |
Dominant samples for positive class |
radii_pos_dominant |
Positive class cover percentage |
Fatih Saglam, saglamf89@gmail.com
library(SMOTEWB)
library(rcccd)
set.seed(10)
# generating data
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(60, 6, 1), ncol = 2, nrow = 30))
y <- as.factor(c(rep("negative", 1000), rep("positive", 30)))
# adding noise: place three positive samples inside the negative class region
x[1001, ] <- c(3, 3)
x[1002, ] <- c(2, 2)
x[1003, ] <- c(4, 4)
# resampling
m_SMOTE <- SMOTE(x = x, y = y, k = 3)
# relocation of resampled data
m_DatRel <- DatRel(x = x, y = y, x_syn = m_SMOTE$x_syn)
# resampled data
plot(x, col = y, main = "SMOTE")
points(m_SMOTE$x_syn, col = "green")
# resampled data after relocation
plot(x, col = y, main = "SMOTE + DatRel")
points(m_DatRel$x_syn, col = "green")