artificial.data: Creates artificial dataset

Description

Creates a data.frame with artificial data. The last six columns are nominal and highly correlated with the feature 'class'. With the default arguments, the data set consists of objects from 3 classes, A, B and C, which contain 40, 20 and 10 objects, respectively (70 objects altogether).

For each object, 6 binary features (A1, A2, B1, B2, C1 and C2) are created, and they are 'ideally' or 'almost ideally' correlated with the class feature. If an object's 'class' equals 'A', then its features A1 and A2 are set to the class value 'A'; otherwise A1 = A2 = 0. For classes 'B' and 'C' the processing is analogous, but some random corruption is introduced: for 2 observations from class 'B', the value 'B' in both attributes B1 and B2 is replaced by '0', and for 4 observations from class 'C', the value 'C' in both attributes C1 and C2 is replaced by '0'. The number of corrupted values for each class is set by the corruption parameter.

The data also contains rnd_features = 500 additional random numerical features with values uniformly distributed on [0, 1].
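The construction described above can be sketched in base R. This is only an illustration of the description, not the rmcfs source: the function name make_artificial, the column order, the V1, V2, ... names of the random features, and the choice to corrupt the same rows in both columns of a pair are all assumptions.

```r
# Hypothetical re-implementation for illustration only; not the rmcfs code.
make_artificial <- function(rnd_features = 500, size = c(40, 20, 10),
                            corruption = c(0, 2, 4), seed = NA) {
  stopifnot(length(size) == 3, length(corruption) == 3,
            all(corruption <= size))
  if (!is.na(seed)) set.seed(seed)

  labels <- c("A", "B", "C")
  class <- rep(labels, times = size)
  n <- length(class)

  # Random numerical features, uniformly distributed on [0, 1].
  d <- as.data.frame(matrix(runif(n * rnd_features), nrow = n,
                            ncol = rnd_features))
  d$class <- factor(class)

  for (k in seq_along(labels)) {
    # Class label for members of the class, "0" for everyone else.
    marker <- ifelse(class == labels[k], labels[k], "0")
    # Reset the requested number of randomly chosen class members to "0".
    members <- which(class == labels[k])
    hit <- members[sample.int(length(members), corruption[k])]
    marker[hit] <- "0"
    # Assumption: both columns of a pair are corrupted in the same rows.
    d[[paste0(labels[k], 1)]] <- factor(marker)
    d[[paste0(labels[k], 2)]] <- factor(marker)
  }
  d
}

# Small data set with 5 random features; cross-tabulate class against C1.
d <- make_artificial(rnd_features = 5, seed = 1)
table(d$class, d$C1)
```

With the default size and corruption, the cross-tabulation should show that 6 of the 10 class 'C' objects keep the value 'C' in C1, while the remaining 4 are set to '0'.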

Usage

artificial.data(rnd_features = 500, size = c(40, 20, 10), 
                        corruption = c(0, 2, 4), seed = NA)

Arguments

rnd_features

number of random numerical features.

size

sizes of classes A, B and C (a vector of three values).

corruption

defines the number of corrupted values for each pair of columns A1/A2, B1/B2 and C1/C2.

seed

seed for random number generator.

Value

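data.frame containing the six important features (A1, A2, B1, B2, C1 and C2), the 'class' feature and rnd_features random numerical features.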

Examples

  d <- artificial.data(rnd_features = 500)
  showme(d)

rmcfs documentation built on Sept. 11, 2024, 8:41 p.m.