View source: R/partition_and_normalize.R
partition_and_normalize | R Documentation |
Function that processes the input data by splitting it into training and test sets and normalizing the outputs depending on the best instance performance. The user can bypass the partition into training and test sets by passing the parameters x.test and y.test.
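The description does not spell out the normalization formula. As one possible reading of "normalizes the outputs depending on the best instance performance", the sketch below rescales each instance (row) so that the best algorithm on that instance gets 1 and worse algorithms get values in (0, 1). The helper name normalize_by_best and the ratio formula are assumptions made for illustration, not the package's implementation, and the sketch assumes strictly positive outputs.

```r
# Hypothetical helper, NOT part of the package: one plausible way to
# normalize KPI values relative to the best algorithm on each instance.
# Assumption: all outputs are strictly positive.
normalize_by_best <- function(y, better_smaller = TRUE) {
  y <- as.matrix(y)
  stopifnot(all(y > 0))
  if (better_smaller) {
    best <- apply(y, 1, min)  # best KPI per instance is the smallest value
    out <- best / y           # best algorithm gets 1, worse ones get < 1
  } else {
    best <- apply(y, 1, max)  # best KPI per instance is the largest value
    out <- y / best           # best algorithm gets 1, worse ones get < 1
  }
  as.data.frame(out)
}

# Toy data: two instances (rows), two algorithms (columns)
y_toy <- data.frame(alg_a = c(2, 10), alg_b = c(4, 5))
y_norm <- normalize_by_best(y_toy, better_smaller = TRUE)
y_norm
```

With better_smaller = FALSE the same toy data would be scaled against the row maximum instead of the row minimum.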
partition_and_normalize(
x,
y,
x.test = NULL,
y.test = NULL,
family_column = NULL,
split_by_family = FALSE,
test_size = 0.3,
better_smaller = TRUE
)
x |
dataframe with the instances (rows) and their features (columns). It may also include a column with the family data. |
y |
dataframe with the instances (rows) and the corresponding output (KPI) for each algorithm (columns). |
x.test |
dataframe with the test features. It may also include a column with the family data. If NULL the algorithm will split x into training and test sets. |
y.test |
dataframe with the test outputs. If NULL the algorithm will split y into training and test sets. |
family_column |
column number of x where each instance's family is indicated. If given, additional options for the training and test set splitting and the graphics are enabled. |
split_by_family |
boolean indicating if we want to split sets keeping family proportions in case x.test and y.test are NULL. This option requires that option |
test_size |
float with the segmentation proportion for the test dataframe. It must be a value between 0 and 1. Only needed when |
better_smaller |
boolean that indicates whether the output (KPI) is better if smaller (TRUE) or larger (FALSE). |
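To illustrate what splitting "keeping family proportions" can mean for split_by_family and test_size, here is a minimal base R sketch of a stratified split. The helper stratified_test_indices is hypothetical and is not the package's code; it samples the same fraction of rows from each family, rounding to the nearest whole row.

```r
# Hypothetical helper, NOT the package's implementation: returns row indices
# for a test set that keeps roughly the same family proportions as `families`.
stratified_test_indices <- function(families, test_size = 0.3) {
  stopifnot(test_size > 0, test_size < 1)
  # Group row indices by family label
  idx_by_family <- split(seq_along(families), families)
  test_idx <- lapply(idx_by_family, function(idx) {
    n_test <- round(length(idx) * test_size)  # rows to draw from this family
    idx[sample.int(length(idx), n_test)]      # sample without replacement
  })
  sort(unlist(test_idx, use.names = FALSE))
}

set.seed(1)
families <- rep(c("a", "b"), times = c(10, 20))  # toy family labels
test_idx <- stratified_test_indices(families, test_size = 0.3)
table(families[test_idx])  # test rows per family
```

The remaining rows, seq_along(families)[-test_idx], would form the training set in this sketch.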
A list of class as_data is returned, containing:
x.train
A data frame with the training features.
y.train
A data frame with the training output.
x.test
A data frame with the test features.
y.test
A data frame with the test output.
y.train.original
A vector with the original training output (without normalizing).
y.test.original
A vector with the original test output (without normalizing).
families.train
A data frame with the families of the training data.
families.test
A data frame with the families of the test data.
data(branching)
data_obj <- partition_and_normalize(branching$x, branching$y, test_size = 0.3,
                                    family_column = 1, split_by_family = TRUE)