ml: Machine learning process


ml    R Documentation

Machine learning process

Description

Function that processes the input data, trains the machine learning models, makes predictions, and plots the results.

Usage

ml(
  x,
  y,
  x.test = NULL,
  y.test = NULL,
  family_column = NULL,
  split_by_family = FALSE,
  predict = TRUE,
  test_size = 0.25,
  better_smaller = TRUE,
  method = "ranger",
  test = TRUE,
  color_list = NULL
)

Arguments

x

dataframe with the instances (rows) and their features (columns). It may also include a column with the family data.

y

dataframe with the instances (rows) and the corresponding output (KPI) for each algorithm (columns).

x.test

dataframe with the test features. It may also include a column with the family data. If NULL, the algorithm will split x into training and test sets.

y.test

dataframe with the test outputs. If NULL, the algorithm will split y into training and test sets.

family_column

column number of x where the family of each instance is indicated. If given, additional options for the training and test set splitting and for the graphics are enabled.

split_by_family

boolean indicating whether to split the sets keeping family proportions when x.test and y.test are NULL. This option requires that family_column is not NULL.

predict

boolean indicating whether predictions will be made. If FALSE, plots will use training data only and no ML column will be displayed.

test_size

float with the segmentation proportion for the test dataframe. It must be a value between 0 and 1.

better_smaller

boolean that indicates whether the output (KPI) is better if smaller (TRUE) or larger (FALSE).
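
As a rough illustration of what better_smaller means for a KPI table shaped like y, the base R sketch below picks the best algorithm for each instance. The helper best_algorithm() and the toy data are hypothetical and are not part of ASML; how ml() uses the flag internally is not shown here.

```r
# Hypothetical helper, not part of ASML: returns the name of the
# best-performing algorithm (column) for each instance (row).
best_algorithm <- function(y, better_smaller = TRUE) {
  # Smaller-is-better KPIs (e.g. running time) use which.min,
  # larger-is-better KPIs (e.g. accuracy) use which.max.
  pick <- if (better_smaller) which.min else which.max
  names(y)[apply(as.matrix(y), 1, pick)]
}

# Toy KPI table: rows are instances, columns are algorithms.
y <- data.frame(alg1 = c(1.2, 5.0, 3.1), alg2 = c(2.4, 0.7, 3.0))

best_algorithm(y, better_smaller = TRUE)   # "alg1" "alg2" "alg2"
best_algorithm(y, better_smaller = FALSE)  # "alg2" "alg1" "alg1"
```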

method

name of the model to be used. The user can choose from any of the models provided by caret. See http://topepo.github.io/caret/train-models-by-tag.html for more information about the models supported.

test

boolean indicating whether the predictions will be made with the test set or the training set.

color_list

list with the colors for the plots. If NULL or if too few colors are given, the colors will be generated automatically.
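
To make the interaction between family_column, split_by_family and test_size more concrete, here is a minimal base R sketch of a split that keeps family proportions. It is an assumption about the general technique (a stratified split), not the code ASML runs; the helper stratified_test_indices() and the toy dataframe are hypothetical.

```r
# Hypothetical helper, not part of ASML: returns the row indices of a
# test set that keeps roughly the same family proportions as the data.
stratified_test_indices <- function(family, test_size = 0.25) {
  stopifnot(test_size > 0, test_size < 1)
  # Group the row indices by family.
  rows_by_family <- split(seq_along(family), family)
  test_rows <- lapply(rows_by_family, function(rows) {
    n_test <- round(length(rows) * test_size)
    # Sample positions rather than values: sample(rows, n) would treat
    # a single number as the range 1:rows.
    rows[sample.int(length(rows), n_test)]
  })
  sort(unlist(test_rows, use.names = FALSE))
}

# Toy data: column 1 holds the family, as family_column = 1 would indicate.
x <- data.frame(
  family = rep(c("A", "B"), times = c(8, 4)),
  feature1 = seq_len(12)
)

set.seed(1)  # make the random split reproducible
test_idx <- stratified_test_indices(x[[1]], test_size = 0.25)
x_train <- x[-test_idx, ]
x_test <- x[test_idx, ]

table(x_test$family)  # 2 rows from family A, 1 row from family B
```

With very small families, round() can assign zero test rows to a family, so a real implementation would need to decide how to handle that case.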

Value

A list with the data and plots generated, including:

  • data_obj An as_data object with the processed data from the partition_and_normalize() function.

  • training An as_train object with the trainings from the AStrain() function.

  • predictions A data frame with the predictions from the ASpredict() function, if the predict argument is TRUE.

  • table A table with the summary of the output data.

  • boxplot, ranking_plot, figure_comparison, optml_figure_comparison and optmlall_figure_comparison with the corresponding plots.

Examples


library(ASML)  # load the package that provides ml() and branchingsmall
data(branchingsmall)
machine_learning <- ml(branchingsmall$x, branchingsmall$y, test_size = 0.3,
                       family_column = 1, split_by_family = TRUE, method = "glm")


ASML documentation built on April 3, 2025, 8:47 p.m.
