hacide: Half circle filled data


Description

Simulated training and test sets for imbalanced binary classification. The rare class lies along a half circle whose interior is filled with the prevalent class, which is normally distributed and has elliptical contours.

Usage

data(hacide)

Format

The data comprise two real-valued features (denoted as x1, x2) and a binary class label (denoted as cls). Positive examples make up about 2% of the data.

hacide.train

Includes 1000 rows and 20 positive examples.

hacide.test

Includes 250 rows and 5 positive examples.
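As a quick check on the 2% figure, the sketch below recomputes the positive rate from the row counts stated above. The `counts` data frame is an illustrative name, not part of the package. The guarded block assumes the ROSE package is installed and reads the same counts from the shipped data.

```r
# Row and positive-example counts as stated in this section
counts <- data.frame(
  set       = c("hacide.train", "hacide.test"),
  rows      = c(1000, 250),
  positives = c(20, 5)
)

# Share of positive examples in each set
counts$prevalence <- counts$positives / counts$rows
print(counts)

# If ROSE is installed, the class counts can be read from the data itself
if (requireNamespace("ROSE", quietly = TRUE)) {
  data(hacide, package = "ROSE")
  print(table(hacide.train$cls))
  print(table(hacide.test$cls))
}
```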

Data have been simulated as follows:

- if cls = 0 then (x1, x2) \sim \mathbf{N}_{2}\left(\mathbf{0}_{2}, (1/4, 1) \mathbf{I}_{2}\right)

- if cls = 1 then (x1, x2) \sim \mathbf{N}_{2}\left(\mathbf{0}_{2}, \mathbf{I}_{2}\right) \cap \left\|\mathbf{x}\right\|^2 > 4 \cap x_2 \leq 0
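The scheme above can be sketched in base R. This is an illustration under stated assumptions, not the code used to build the shipped data: the helpers `sim_majority` and `sim_minority` are hypothetical names, the covariance (1/4, 1) I_2 is read as a diagonal matrix with variances 1/4 and 1, and the rare class is drawn by rejection sampling from a standard bivariate normal. The seed is arbitrary, so the draws will not match `hacide.train`.

```r
# Hypothetical helper: n draws from the prevalent class (cls = 0).
# Assumption: (1/4, 1) I_2 is a diagonal covariance, i.e. sd 0.5 for x1
# and sd 1 for x2, with independent components.
sim_majority <- function(n) {
  data.frame(cls = rep(0, n),
             x1  = rnorm(n, mean = 0, sd = 0.5),
             x2  = rnorm(n, mean = 0, sd = 1))
}

# Hypothetical helper: n draws from the rare class (cls = 1).
# Rejection sampling: propose from N(0, I_2) and keep only points with
# squared norm above 4 that lie in the lower half plane (x2 <= 0).
sim_minority <- function(n) {
  x1 <- numeric(0)
  x2 <- numeric(0)
  while (length(x1) < n) {
    p1 <- rnorm(50 * n)
    p2 <- rnorm(50 * n)
    keep <- (p1^2 + p2^2 > 4) & (p2 <= 0)
    x1 <- c(x1, p1[keep])
    x2 <- c(x2, p2[keep])
  }
  data.frame(cls = rep(1, n), x1 = x1[seq_len(n)], x2 = x2[seq_len(n)])
}

set.seed(1)  # arbitrary seed, for repeatability of this sketch only
sim <- rbind(sim_majority(980), sim_minority(20))
sim$cls <- factor(sim$cls, levels = c(0, 1))
print(table(sim$cls))
```

The class sizes 980 and 20 mirror the training set described above. Rejection sampling is chosen for simplicity. It discards most proposals, which is acceptable at this sample size.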

References

Lunardon, N., Menardi, G., and Torelli, N. (2014). ROSE: a Package for Binary Imbalanced Learning. R Journal, 6:82–92.

Menardi, G. and Torelli, N. (2014). Training and assessing classification rules with imbalanced data. Data Mining and Knowledge Discovery, 28:92–122.

Examples

data(hacide)
summary(hacide.train)
summary(hacide.test)

Example output

Loaded ROSE 0.0-3

 cls           x1                 x2          
 0:980   Min.   :-3.73468   Min.   :-3.17886  
 1: 20   1st Qu.:-0.39539   1st Qu.:-0.78564  
         Median :-0.03025   Median :-0.06871  
         Mean   :-0.03185   Mean   :-0.06603  
         3rd Qu.: 0.35474   3rd Qu.: 0.69454  
         Max.   : 1.98859   Max.   : 3.03422  
 cls           x1                 x2          
 0:245   Min.   :-2.12655   Min.   :-2.84904  
 1:  5   1st Qu.:-0.32244   1st Qu.:-0.57730  
         Median : 0.04004   Median : 0.10856  
         Mean   : 0.02918   Mean   : 0.09874  
         3rd Qu.: 0.37115   3rd Qu.: 0.82948  
         Max.   : 2.15575   Max.   : 4.36886  

ROSE documentation built on May 29, 2017, 8:43 p.m.