generate_imbalanced_data is a simple function to generate a two-class imbalanced data set.
generate_imbalanced_data(num_examples = 100L, num_features = 2L,
  imbalance_ratio = 5, noise_maj = 0.05, noise_min = 0.1, seed = NULL)
num_examples | Total number of examples in the data set.
num_features | Total number of features in the data set.
imbalance_ratio | Ratio of the number of examples in the majority class to the number of examples in the minority class.
noise_maj | Fraction of the minority class that is mislabelled as majority class.
noise_min | Fraction of the majority class that is mislabelled as minority class.
seed | Integer value for reproducibility purposes.
The generated data set has two classes. The majority class is drawn from a multivariate normal distribution with mean zero and unit standard deviation for every feature; the minority class is drawn from a multivariate normal distribution with mean two and unit standard deviation for every feature.
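As a rough illustration of the two distributions described above, the following base R sketch draws one sample per class with rnorm. It assumes the features are uncorrelated (identity covariance), which the description suggests but does not state outright. The helper name draw_class and the sample sizes are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): draw n points whose
# features are independent normals with the given mean and unit sd.
draw_class <- function(n, num_features, mean) {
  matrix(rnorm(n * num_features, mean = mean, sd = 1),
         nrow = n, ncol = num_features)
}

set.seed(1)
majority <- draw_class(500, 2, mean = 0)  # majority class centred at 0
minority <- draw_class(100, 2, mean = 2)  # minority class centred at 2

# Per-feature sample means should land near 0 and 2 respectively.
colMeans(majority)
colMeans(minority)
```

Because both classes share the same unit spread and their means are only two standard deviations apart per feature, the classes overlap, so the data are not perfectly separable even before any label noise is added.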
The total number of examples and the dimensionality of the data are set through the num_examples and num_features arguments. The imbalance_ratio argument, together with num_examples, determines the exact number of examples in the majority and minority classes. To simulate label noise, approximately a fraction noise_min of the majority class examples are labelled as minority class examples, and approximately a fraction noise_maj of the minority class examples are labelled as majority class examples (both arguments are fractions, not counts).
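The class sizes and the label noise can be sketched in base R as follows. This is not the package's implementation: the rounding rule for the class sizes, the way mislabelled examples are selected, and the names sketch_labels, true_class and label are all assumptions made for illustration.

```r
# Hypothetical sketch (not the package's code) of class sizes and label noise.
sketch_labels <- function(num_examples = 100L, imbalance_ratio = 5,
                          noise_maj = 0.05, noise_min = 0.1, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)

  # Assumption: sizes satisfy n_maj / n_min ~ imbalance_ratio after rounding.
  n_min <- round(num_examples / (1 + imbalance_ratio))
  n_maj <- num_examples - n_min

  true_class <- c(rep("majority", n_maj), rep("minority", n_min))
  label <- true_class

  # Majority examples relabelled as minority (fraction noise_min).
  maj_idx <- seq_len(n_maj)
  flip_maj <- maj_idx[sample.int(n_maj, round(noise_min * n_maj))]
  label[flip_maj] <- "minority"

  # Minority examples relabelled as majority (fraction noise_maj).
  min_idx <- n_maj + seq_len(n_min)
  flip_min <- min_idx[sample.int(n_min, round(noise_maj * n_min))]
  label[flip_min] <- "majority"

  data.frame(true_class = true_class, label = label)
}

labels <- sketch_labels(seed = 42)
# Cross-tabulate the underlying class against the (noisy) observed label.
table(labels$true_class, labels$label)
```

Under this rounding assumption, the default arguments give 83 majority and 17 minority examples. The real function may split or round differently, so treat these counts as illustrative only.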
A data frame containing an imbalanced two-class data set.