View source: R/deep_learning.R
deep_learning
deep_learning() is a wrapper of the keras::keras_model_sequential() function that fits a deep learning model and makes it easy to tune the hyperparameters with grid search or Bayesian optimization. You can fit univariate and multivariate models for numeric and/or categorical response variables.
All the parameters marked as (tunable) accept either a vector of values, from which the grid is generated for grid search tuning, or a list with the min and max values for Bayesian optimization tuning. The returned object contains a data.frame with the evaluated hyperparameter combinations. In the end, the best combination of hyperparameters is used to fit the final model, which is also returned and can be used to make new predictions.
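A quick sketch of the two ways a tunable parameter can be specified (the values below are arbitrary and only illustrate the format):
x <- to_matrix(iris[, -1])
y <- iris$Sepal.Length
# Grid search tuning: a vector with the candidate values
model <- deep_learning(x, y, learning_rate = c(0.001, 0.01, 0.1))
# Bayesian optimization tuning: a list with the min and max bounds
model <- deep_learning(
  x,
  y,
  learning_rate = list(min = 0.001, max = 0.1),
  tune_type = "bayesian_optimization"
)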
deep_learning(
x,
y,
learning_rate = 0.001,
epochs_number = 500,
batch_size = 32,
layers = list(list(neurons_number = 50, neurons_proportion = NULL, activation = "relu",
dropout = 0, ridge_penalty = 0, lasso_penalty = 0)),
output_penalties = list(ridge_penalty = 0, lasso_penalty = 0),
tune_type = "Grid_search",
tune_cv_type = "K_fold",
tune_folds_number = 5,
tune_testing_proportion = 0.2,
tune_folds = NULL,
tune_grid_proportion = 1,
tune_bayes_samples_number = 10,
tune_bayes_iterations_number = 10,
optimizer = "adam",
loss_function = NULL,
with_platt_scaling = FALSE,
platt_proportion = 0.3,
shuffle = TRUE,
early_stop = FALSE,
early_stop_patience = 50,
validate_params = TRUE,
seed = NULL,
verbose = TRUE
)
x: The predictor variables matrix.
y: The response variable(s): a vector for univariate models or a data.frame for multivariate models, with numeric and/or categorical variables.
learning_rate: (tunable) The learning rate used to train the model. 0.001 by default.
epochs_number: (tunable) The number of epochs used to train the model. 500 by default.
batch_size: (tunable) The number of records used in each training batch. 32 by default.
layers: A list where each element is another list that defines one hidden layer with the (tunable) fields neurons_number, neurons_proportion, activation, dropout, ridge_penalty and lasso_penalty. You can provide as many hidden layers as you want. By default a single hidden layer with 50 neurons, "relu" activation, no dropout and no penalties is used.
output_penalties: A list with the ridge_penalty and lasso_penalty applied to the output layer. You do not have to provide the two values; if one of them is not provided, the default value is used. By default both penalties are 0.
tune_type: The tuning method, either "Grid_search" or "Bayesian_optimization". "Grid_search" by default.
tune_cv_type: The type of cross validation used for tuning, either "K_fold" or "Random". "K_fold" by default.
tune_folds_number: The number of folds (or partitions) used in the cross validation for tuning. 5 by default.
tune_testing_proportion: The proportion of records used for testing in each partition when tune_cv_type is "Random". 0.2 by default.
tune_folds: A list with custom folds to use for tuning. NULL by default, so the folds are generated automatically.
tune_grid_proportion: The proportion of combinations sampled from the full grid and evaluated in grid search tuning. 1 by default (all combinations).
tune_bayes_samples_number: The number of initial hyperparameter combinations generated and evaluated in Bayesian optimization tuning. 10 by default.
tune_bayes_iterations_number: The number of iterations of Bayesian optimization tuning. 10 by default.
optimizer: The optimizer used to train the model. "adam" by default.
loss_function: The loss function used to train the model (see the available options in Details). NULL by default, in which case a loss function is selected based on the type of the response variable(s).
with_platt_scaling: Whether to apply Platt scaling calibration after tuning (only for numeric and binary response variables of univariate models). FALSE by default.
platt_proportion: The proportion of records used as the Calibration dataset when with_platt_scaling is TRUE. 0.3 by default.
shuffle: Whether to shuffle the training data during training. TRUE by default.
early_stop: Whether to stop training early when the loss stops improving. FALSE by default.
early_stop_patience: The number of epochs without improvement after which training is stopped when early_stop is TRUE. 50 by default.
validate_params: Whether to validate the provided parameters. TRUE by default.
seed: A seed for reproducible results. NULL by default.
verbose: Whether to print progress information. TRUE by default.
tune_loss_function: Not used in deep_learning: the loss function specified in loss_function is also used for tuning (see Details).
You have to consider that before tuning, all columns without variance (where all the records have the same value) are removed. The positions of such columns are returned in the removed_x_cols field of the returned object.
All records with missing values (NA), either in x or in y, are removed as well. The positions of the removed records are returned in the removed_rows field of the returned object.
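A minimal sketch of inspecting these fields after fitting (the constant column and the missing value are introduced only for illustration):
x <- to_matrix(iris[, -5])
x <- cbind(x, Constant = 1) # a column without variance
x[1, 1] <- NA               # a record with a missing value
y <- iris$Species
model <- deep_learning(x, y, epochs_number = 10)
model$removed_x_cols # position of the removed column
model$removed_rows   # position of the removed record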
The general tuning algorithm works as follows:
For grid search tuning, the hyperparameters grid is generated (step one in the algorithm) with the Cartesian product of all the provided values (all the possible combinations) of all tunable parameters. If only one value of each tunable parameter is provided, no tuning is done. tune_grid_proportion allows you to specify the proportion of combinations you want to sample from the full grid and tune; by default all combinations are evaluated.
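For example, the following sketch (with arbitrary values) generates a 2 x 2 grid but samples and evaluates only half of its combinations:
x <- to_matrix(iris[, -1])
y <- iris$Sepal.Length
model <- deep_learning(
  x,
  y,
  epochs_number = c(10, 20),
  learning_rate = c(0.001, 0.01),
  tune_grid_proportion = 0.5, # evaluate 2 of the 4 combinations
  tune_cv_type = "k_fold",
  tune_folds_number = 3
)
model$hyperparams_grid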
For Bayesian optimization tuning, step one in the algorithm works a little differently. At the start, tune_bayes_samples_number different hyperparameter combinations are generated and evaluated; then tune_bayes_iterations_number new hyperparameter combinations are generated and evaluated iteratively based on the Bayesian optimization algorithm, but this process is equivalent to that described in the general tuning algorithm. Note that only the hyperparameters for which a list of min and max values was provided are tuned, and their values fall within the specified boundaries.
Important: Unlike the other models, when tuning deep learning models steps 6 and 7 of the algorithm are omitted. Instead, the train and test datasets are sent to keras, the first one to fit the model and the second one to compute the loss function at the end of each epoch, so the value saved in step 8 is the validation loss returned by keras in the last epoch. The tune_loss_function parameter cannot be used in the deep_learning function, since the same loss function evaluated at each epoch and specified in the loss_function parameter is also used for tuning.
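Related to the per-epoch loss computation, the early_stop and early_stop_patience parameters shown in the usage section control early stopping during training; a minimal sketch (the values are arbitrary):
x <- to_matrix(iris[, -1])
y <- iris$Sepal.Length
model <- deep_learning(
  x,
  y,
  epochs_number = 500,
  early_stop = TRUE,
  early_stop_patience = 20 # stop if the loss does not improve for 20 epochs
)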
By default this function selects the activation function and the number of neurons for the last layer of the model based on the type(s) of the response variable(s). For continuous responses the "linear" (identity) activation function is used with one neuron, for count responses "exponential" with one neuron, for binary responses "sigmoid" with one neuron, and for categorical responses "softmax" with as many neurons as there are categories.
The available options of the loss_function parameter are listed below (a short usage sketch follows the list):
Probabilistic losses
"binary_crossentropy"
"categorical_crossentropy"
"sparse_categorical_crossentropy"
"poisson"
"kl_divergence"
Regression losses
"mean_squared_error"
"mean_absolute_error"
"mean_absolute_percentage_error"
"mean_squared_logarithmic_error"
"cosine_similarity"
"huber"
"log_cosh"
Hinge losses for "maximum-margin" classification
"hinge"
"squared_hinge"
"categorical_hinge"
Platt scaling is a way of improving the training process of deep learning models that uses a calibration based on an already trained model, applied as a post-processing operation.
After tuning, Platt scaling calibration divides the dataset into Training and Calibration datasets, then uses the Training set to fit the deep learning model with the best hyperparameter combination and with this model computes the predictions for the Calibration set. Finally, with the predicted and true values a linear model is fitted (observed as a function of predicted); this linear model is the calibration, and when a new prediction is made, first the deep learning model is used and the resulting predicted value is then calibrated with the linear model.
Note that Platt scaling calibration only works for numeric and binary response variables of univariate models.
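A conceptual sketch of the calibration step described above (only an illustration of the idea, not the package's internal code; the values are made up):
# Hypothetical predicted and observed values on the Calibration dataset
predicted <- c(4.8, 5.1, 5.9, 6.4)
observed <- c(5.0, 5.0, 6.1, 6.3)
# The calibration model: observed as a function of predicted
calibration <- lm(observed ~ predicted)
# A new prediction from the deep learning model is then calibrated
new_prediction <- 5.5
predict(calibration, data.frame(predicted = new_prediction))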
An object of class "DeepLearningModel" that inherits from classes "Model" and "R6" with the fields:
fitted_model: An object of class keras::keras_model_sequential() with the fitted model.
x: The final matrix used to fit the model.
y: The final vector or matrix used to fit the model.
hyperparams_grid: A data.frame with all the evaluated combinations of hyperparameters and one additional column called "loss" with the value of the loss function for each combination. The data is ordered with the best combinations first, sometimes with the lowest values first and other times with the greatest values first, depending on the loss function.
best_hyperparams: A list with the combination of hyperparameters that achieved the best loss value (the first row of hyperparams_grid).
execution_time: A difftime object with the total time taken to tune and fit the model.
removed_rows: A numeric vector with the indices of the records (in the provided positions) that were removed and not taken into account for tuning nor training.
removed_x_cols: A numeric vector with the indices of the columns (in the provided positions) that were removed and not taken into account for tuning nor training.
...: Some other fields for internal use.
predict.Model()
Other models: bayesian_model(), generalized_boosted_machine(), generalized_linear_model(), mixed_model(), partial_least_squares(), random_forest(), support_vector_machine()
# Use all default hyperparameters (no tuning) -------------------------------
x <- to_matrix(iris[, -5])
y <- iris$Species
model <- deep_learning(x, y)
# Predict using the fitted model
predictions <- predict(model, x)
# Obtain the predicted values
predictions$predicted
# Obtain the predicted probabilities
predictions$probabilities
# Tune with grid search -----------------------------------------------------
x <- to_matrix(iris[, -1])
y <- iris$Sepal.Length
model <- deep_learning(
x,
y,
epochs_number = c(10, 20),
learning_rate = c(0.001, 0.01),
layers = list(
# First hidden layer
list(neurons_number = c(10, 20)),
# Second hidden layer
list(neurons_number = c(10))
),
tune_type = "grid_search",
tune_cv_type = "k_fold",
tune_folds_number = 5
)
# Obtain the whole grid with the loss values
model$hyperparams_grid
# Obtain the hyperparameters combination with the best loss value
model$best_hyperparams
# Predict using the fitted model
predictions <- predict(model, x)
# Obtain the predicted values
predictions$predicted
# Tune with Bayesian optimization -------------------------------------------
x <- to_matrix(iris[, -1])
y <- iris$Sepal.Length
model <- deep_learning(
x,
y,
epochs_number = list(min = 10, max = 50),
learning_rate = list(min = 0.001, max = 0.5),
layers = list(
list(
neurons_number = list(min = 10, max = 20),
dropout = list(min = 0, max = 1),
activation = "sigmoid"
)
),
tune_type = "bayesian_optimization",
tune_bayes_samples_number = 5,
tune_bayes_iterations_number = 5,
tune_cv_type = "random",
tune_folds_number = 2
)
# Obtain the whole grid with the loss values
model$hyperparams_grid
# Obtain the hyperparameters combination with the best loss value
model$best_hyperparams
# Predict using the fitted model
predictions <- predict(model, x)
# Obtain the predicted values
predictions$predicted
# Obtain the execution time taken to tune and fit the model
model$execution_time
# Multivariate analysis -----------------------------------------------------
x <- to_matrix(iris[, -c(1, 5)])
y <- iris[, c(1, 5)]
model <- deep_learning(
x,
y,
epochs_number = 10,
layers = list(
list(
neurons_number = 50,
dropout = 0.5,
activation = "relu",
ridge_penalty = 0.5,
lasso_penalty = 0.5
)
),
optimizer = "adadelta"
)
# Predict using the fitted model
predictions <- predict(model, x)
# Obtain the predicted values of the first response
predictions$Sepal.Length$predicted
# Obtain the predicted values and probabilities of the second response
predictions$Species$predicted
predictions$Species$probabilities
# Obtain the predictions in a data.frame not in a list
predictions <- predict(model, x, format = "data.frame")
head(predictions)
# With Platt scaling --------------------------------------------------------
x <- to_matrix(iris[, -1])
y <- iris$Sepal.Length
model <- deep_learning(
x,
y,
with_platt_scaling = TRUE,
platt_proportion = 0.25
)
# Predict using the fitted model
predictions <- predict(model, x)
# Obtain the predicted values
predictions$predicted
# Genomic selection ------------------------------------------------------------
data(Maize)
# Data preparation of G
Line <- model.matrix(~ 0 + Line, data = Maize$Pheno)
# Compute the Cholesky decomposition
Geno <- cholesky(Maize$Geno)
# G matrix
X <- Line %*% Geno
y <- Maize$Pheno$Y
# Set seed for reproducible results
set.seed(2022)
folds <- cv_kfold(records_number = nrow(X), k = 4)
Predictions <- data.frame()
Hyperparams <- data.frame()
# Model training and predictions
for (i in seq_along(folds)) {
cat("*** Fold:", i, "***\n")
fold <- folds[[i]]
# Identify the training and testing sets
X_training <- X[fold$training, ]
X_testing <- X[fold$testing, ]
y_training <- y[fold$training]
y_testing <- y[fold$testing]
# Model training
model <- deep_learning(
X_training,
y_training,
epochs_number = list(min = 50, max = 100),
learning_rate = list(min = 0.0001, max = 0.1),
layers = list(
list(
neurons_number = list(min = 2, max = 5),
activation = c("linear")
),
list(
neurons_number = list(min = 2, max = 10),
activation = c("linear")
)
),
tune_type = "Bayesian_Optimization",
tune_bayes_iterations_number = 5,
tune_bayes_samples_number = 5,
tune_cv_type = "k_fold",
tune_folds_number = 3
)
# Prediction of testing set
predictions <- predict(model, X_testing)
# Predictions for the i-th fold
FoldPredictions <- data.frame(
Fold = i,
Line = Maize$Pheno$Line[fold$testing],
Env = Maize$Pheno$Env[fold$testing],
Observed = y_testing,
Predicted = predictions$predicted
)
Predictions <- rbind(Predictions, FoldPredictions)
# Hyperparams
HyperparamsFold <- model$hyperparams_grid %>%
mutate(Fold = i)
Hyperparams <- rbind(Hyperparams, HyperparamsFold)
# Best hyperparams of the model
cat("*** Optimal hyperparameters: ***\n")
print(model$best_hyperparams)
}
head(Predictions)
# Compute the summary of all predictions
summaries <- gs_summaries(Predictions)
# Summaries by Line
head(summaries$line)
# Summaries by Environment
summaries$env
# Summaries by Fold
summaries$fold
# First rows of Hyperparams
head(Hyperparams)
# Last rows of Hyperparams
tail(Hyperparams)