Description
This callback implements a cyclical learning rate policy (CLR). The method cycles the learning rate between two boundaries with a constant frequency, as detailed in the paper by Smith (2017), "Cyclical Learning Rates for Training Neural Networks" (arXiv:1506.01186). In addition, the callback supports scaled learning-rate bandwidths (see section 'Differences to the Python implementation'). Note that this callback is very general, as it can be used to specify:

- constant learning rates.
- cyclical learning rates.
- decaying learning rates depending on validation loss, such as keras::callback_reduce_lr_on_plateau().
- learning rates with scaled bandwidths.

Apart from this, the implementation follows the Python implementation quite closely.
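Because the learning rate always stays between the two boundaries, a constant learning rate is simply the degenerate case where both boundaries coincide (a minimal sketch using the arguments documented below; not run):

# Zero cycle amplitude: the learning rate never moves
callback_constant <- new_callback_cyclical_learning_rate(
  step_size = 32,
  base_lr = 0.001,
  max_lr = 0.001
)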
Usage

new_callback_cyclical_learning_rate(
  base_lr,
  max_lr,
  step_size,
  mode = "triangular",
  gamma,
  scale_fn,
  scale_mode,
  patience,
  factor,
  decrease_base_lr,
  cooldown,
  verbose
)
Arguments

base_lr
    Initial learning rate which is the lower boundary in the cycle.

max_lr
    Upper boundary in the cycle. Functionally, it defines the cycle amplitude (max_lr - base_lr). The learning rate at any point is the sum of base_lr and some scaling of the amplitude, so max_lr may not actually be reached, depending on the scaling function.

step_size
    Number of training iterations per half cycle. The authors suggest setting step_size to 2-8 times the number of training iterations per epoch.

mode
    One of "triangular", "triangular2" or "exp_range". Default "triangular". Values correspond to the policies detailed in the Details section. If scale_fn is supplied, this argument is ignored.

gamma
    Constant in the "exp_range" scaling function: gamma^(cycle iterations).

scale_fn
    Custom scaling policy defined by a single-argument anonymous function, where 0 <= scale_fn(x) <= 1 for all x >= 0. If supplied, mode is ignored (see the sketch after this list).

scale_mode
    Either "cycle" or "iterations". Defines whether scale_fn is evaluated on the cycle number or on cycle iterations, i.e. training iterations since the start of the current cycle.

patience
    The number of epochs of training without validation loss improvement that the callback will wait before it adjusts the learning-rate boundaries.

factor
    A numeric vector of length one by which the learning-rate boundaries are scaled once patience is exceeded. Must be smaller than one for the bandwidth to shrink.

decrease_base_lr
    Boolean indicating whether base_lr should also be scaled by factor when the boundaries are adjusted.

cooldown
    Number of epochs to wait before resuming normal operation after the learning rate has been reduced.

verbose
    Currently supporting 0 (silent) and 1 (verbose).
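As an illustration of scale_fn, the built-in "triangular2" policy can be reproduced with a custom scaling function (a sketch based on the argument descriptions above; the values are purely illustrative):

# Halve the amplitude on every new cycle, i.e. "triangular2"
callback_custom <- new_callback_cyclical_learning_rate(
  step_size = 32,
  base_lr = 0.001,
  max_lr = 0.006,
  scale_fn = function(x) 1 / (2^(x - 1)),
  scale_mode = "cycle" # evaluate scale_fn on the cycle number
)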
Details
The amplitude of the cycle can be scaled on a per-iteration or per-cycle basis. This class has three built-in policies, as put forth in the paper.
"triangular": A basic triangular cycle w/ no amplitude scaling.
"triangular2": A basic triangular cycle that scales initial amplitude by half each cycle.
"exp_range": A cycle that scales initial amplitude by gamma**(cycle iterations) at each cycle iteration.
For more details, please see the paper.
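The schedule behind these policies is compact enough to write down directly. The following standalone sketch implements the formulas from the paper for all three modes; the helper name clr_lr is illustrative and not part of the package:

# Learning rate at given training iterations under the CLR policy
clr_lr <- function(iteration, base_lr, max_lr, step_size,
                   mode = "triangular", gamma = 1) {
  cycle <- floor(1 + iteration / (2 * step_size))
  x <- abs(iteration / step_size - 2 * cycle + 1)
  scale <- switch(mode,
    triangular  = 1,
    triangular2 = 1 / (2^(cycle - 1)),
    exp_range   = gamma^iteration
  )
  base_lr + (max_lr - base_lr) * pmax(0, 1 - x) * scale
}

# For example, the rate over the first two full cycles:
clr_lr(0:128, base_lr = 0.001, max_lr = 0.006, step_size = 32)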
Differences to the Python implementation

This implementation differs from the Python implementation in the following aspects:

- A scaled learning-rate bandwidth on plateau is supported. Via the arguments patience, factor and decrease_base_lr, the user has control over if and when the boundaries of the learning rate are adjusted. This feature makes it possible to combine decaying learning rates with cyclical learning rates; typically, one wants to reduce the learning-rate bandwidth after the validation loss has stopped improving for some time. Note that both factor < 1 and patience < Inf must hold in order for this feature to take effect (see the sketch after this list).
- The history data frame in the return value of this callback has a column epoch in addition to iteration and lr.
- The callback also returns a history_epoch data frame that contains just the epochs and the learning rates at the end of each epoch. This is less granular than the history element.
- All column names in history and history_epoch are, as opposed to the Python implementation, in the singular.
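For instance, to shrink the bandwidth once the validation loss has stagnated, the callback might be configured as follows (a sketch using the documented arguments; the chosen values are purely illustrative):

callback_clr_decay <- new_callback_cyclical_learning_rate(
  step_size = 32,
  base_lr = 0.001,
  max_lr = 0.006,
  patience = 2,            # wait two epochs without improvement ...
  factor = 0.5,            # ... then halve the bandwidth
  decrease_base_lr = TRUE, # also lower the lower boundary
  cooldown = 1             # pause before monitoring resumes
)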
Examples

library(keras)
# Boston housing data; %<-% is the zeallot multi-assignment operator re-exported by keras
dataset <- dataset_boston_housing()
c(c(train_data, train_targets), c(test_data, test_targets)) %<-% dataset
# Standardize the features using training-set statistics
mean <- apply(train_data, 2, mean)
std <- apply(train_data, 2, sd)
train_data <- scale(train_data, center = mean, scale = std)
test_data <- scale(test_data, center = mean, scale = std)
model <- keras_model_sequential() %>%
layer_dense(
units = 64, activation = "relu",
input_shape = dim(train_data)[[2]]
) %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 1)
model %>% compile(
optimizer = optimizer_rmsprop(lr = 0.001),
loss = "mse",
metrics = c("mae")
)
# Cyclical learning rate with exponentially shrinking amplitude
callback_clr <- new_callback_cyclical_learning_rate(
step_size = 32,
base_lr = 0.001,
max_lr = 0.006,
gamma = 0.99,
mode = "exp_range"
)
model %>% fit(
train_data, train_targets,
validation_data = list(test_data, test_targets),
epochs = 10, verbose = 1,
callbacks = list(callback_clr)
)
# Per-iteration learning-rate history, and a plot of it
callback_clr$history
plot_clr_history(callback_clr, backend = "base")
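# The coarser per-epoch record described under 'Differences to the Python
# implementation' is available as history_epoch (column names in the singular)
callback_clr$history_epoch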