artificial.data | R Documentation |
Creates a data.frame with artificial data. The data set consists of objects from 3 classes, A, B and C, containing 40, 20 and 10 objects respectively (70 objects altogether). For each object, 6 binary features (A1, A2, B1, B2, C1 and C2) are created; they are 'ideally' or 'almost ideally' correlated with the 'class' feature. These six features are nominal and form the last six columns of the data.frame.
If an object's 'class' equals 'A', its features A1 and A2 are set to the class value 'A'; otherwise A1 = A2 = 0. Classes 'B' and 'C' are processed analogously, but with some random corruption. In each of the attributes B1 and B2, the value 'B' is replaced by '0' for 2 observations from class 'B'. In each of the attributes C1 and C2, the value 'C' is replaced by '0' for 4 observations from class 'C'. The number of corrupted values for each class is defined by the corruption parameter.
The data also contain rnd_features additional random numerical features (500 by default) with values uniformly distributed on [0,1].
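The construction described above can be sketched in base R. This is an illustrative re-implementation, not the package's source: the function name make_artificial_sketch, the column names rnd1, rnd2, ..., the column order, and the choice to draw the corrupted observations independently for each column are all assumptions.

```r
# Hypothetical sketch of the data layout described above (not the package code).
make_artificial_sketch <- function(rnd_features = 500, size = c(40, 20, 10),
                                   corruption = c(0, 2, 4), seed = NA) {
  if (!is.na(seed)) set.seed(seed)
  labels <- c("A", "B", "C")
  class <- rep(labels, times = size)
  n <- length(class)

  # Uniform [0, 1] noise features; the names rnd1, rnd2, ... are assumed.
  rnd <- as.data.frame(matrix(runif(n * rnd_features), nrow = n))
  names(rnd) <- sprintf("rnd%d", seq_len(rnd_features))

  # One nominal column: the class label inside the class, "0" elsewhere,
  # then k randomly chosen in-class values are replaced by "0".
  make_feature <- function(label, k) {
    x <- ifelse(class == label, label, "0")
    idx <- which(class == label)
    hit <- idx[sample.int(length(idx), k)]
    x[hit] <- "0"
    factor(x, levels = c("0", label))
  }

  # Two columns per class (A1, A2, B1, B2, C1, C2), corrupted independently
  # per column (an assumption; the description does not settle this).
  nominal <- list()
  for (i in seq_along(labels)) {
    for (j in 1:2) {
      nominal[[paste0(labels[i], j)]] <- make_feature(labels[i], corruption[i])
    }
  }

  data.frame(class = factor(class), rnd, nominal)
}

d <- make_artificial_sketch(rnd_features = 5, seed = 1)
```

Cross-tabulating a nominal column against the class, for example table(d$class, d$B1), shows the effect of the corruption: with the defaults, 2 of the 20 class 'B' rows carry '0' in B1.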
artificial.data(rnd_features = 500, size = c(40, 20, 10),
corruption = c(0, 2, 4), seed = NA)
rnd_features | number of numerical random features. |
size | sizes (numbers of objects) of classes A, B, and C. |
corruption | number of corrupted values for each pair of columns A1/A2, B1/B2, C1/C2. |
seed | seed for random number generator. |
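A data.frame containing the 'class' feature, the rnd_features random numerical features and the six important features (A1, A2, B1, B2, C1, C2).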
d <- artificial.data(rnd_features = 500)
showme(d)