sample_generator: The function to generate a 2-dimensional dataset

sample_generator    R Documentation

The function to generate a 2-dimensional dataset

Description

Generates a 2-dimensional dataset given the number of instances and the ratio of negative instances to total instances. Positive instances are distributed uniformly within a circle at the center of the domain, while negative instances are spread over the surrounding domain. A number of random positive outcasts is also generated. The dataset is intended to show the differences between the datasets produced by each sampling technique.
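The layout described above can be sketched in base R. This is only an illustrative sketch, not the smotefamily implementation: the name sketch_generator is hypothetical, the use of rejection sampling for the negative class is an assumption, and the overlap and outcast_ratio behaviour is deliberately left out.

```r
# Hypothetical sketch; NOT the code used by smotefamily::sample_generator.
# Assumptions: positives lie inside a centered circle, negatives lie outside it,
# and the overlap / outcast_ratio arguments are ignored.
sketch_generator <- function(n, ratio = 0.8, xlim = c(0, 1), ylim = c(0, 1),
                             radius = 0.25) {
  n_neg <- round(n * ratio)  # number of negative instances
  n_pos <- n - n_neg         # number of positive instances
  cx <- mean(xlim)           # circle center, first dimension
  cy <- mean(ylim)           # circle center, second dimension

  # Positive points: uniform over the disc (sqrt makes the density uniform in area)
  r <- radius * sqrt(runif(n_pos))
  theta <- runif(n_pos, 0, 2 * pi)
  pos <- data.frame(x = cx + r * cos(theta),
                    y = cy + r * sin(theta),
                    class = rep("p", n_pos))

  # Negative points: draw uniformly over the domain, keep those outside the circle
  neg_x <- numeric(0)
  neg_y <- numeric(0)
  while (length(neg_x) < n_neg) {
    x <- runif(n_neg, xlim[1], xlim[2])
    y <- runif(n_neg, ylim[1], ylim[2])
    keep <- (x - cx)^2 + (y - cy)^2 > radius^2
    neg_x <- c(neg_x, x[keep])
    neg_y <- c(neg_y, y[keep])
  }
  neg <- data.frame(x = neg_x[seq_len(n_neg)],
                    y = neg_y[seq_len(n_neg)],
                    class = rep("n", n_neg))

  rbind(pos, neg)
}

set.seed(1)
d <- sketch_generator(1000, ratio = 0.8)
table(d$class)  # class counts follow from n and ratio
```

Rejection sampling keeps the sketch short; it suits the default settings, where the circle covers only a small part of the domain.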

Usage

sample_generator(n, ratio = 0.8, xlim = c(0, 1), ylim = c(0, 1),
   radius = 0.25, overlap = -0.05, outcast_ratio = 0.01)

Arguments

n

The number of instances in the dataset

ratio

The ratio of negative instances to the total number of instances

xlim

The range of values in the first dimension

ylim

The range of values in the second dimension

radius

The radius of the circle of positive instances

overlap

The gap between the sets of positive and negative instances

outcast_ratio

The ratio of outcasts to be generated in this dataset

Value

A 2-dimensional dataset whose 3rd column is the target class vector ("p" for positive and "n" for negative instances, as used in the example below).

Author(s)

Wacharasak Siriseriwan <wacharasak.s@gmail.com>

Examples

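	library(smotefamily)
	# Generate 5000 instances, 80% of them negative
	data_example <- sample_generator(5000, ratio = 0.80)
	# Plot the negative instances ("n") in yellow
	plot(data_example[data_example[, 3] == "n", 1],
	     data_example[data_example[, 3] == "n", 2], col = "yellow")
	# Overlay the positive instances ("p") in red
	points(data_example[data_example[, 3] == "p", 1],
	       data_example[data_example[, 3] == "p", 2], col = "red", pch = 14)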

smotefamily documentation built on May 29, 2024, 7:54 a.m.