new_callback_cyclical_learning_rate: Cyclical learning rate scheduler

View source: R/clr_callback.R

Description

This callback implements a cyclical learning rate policy (CLR), where the learning rate cycles between two boundaries at a constant frequency, as detailed in Smith (2017). In addition, it supports scaled learning-rate bandwidths to automatically adjust the learning rate boundaries when the validation loss plateaus.
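
The triangular policy at the heart of CLR can be sketched in a few lines of R. This is an illustration of the formula in Smith (2017), not the callback's internal implementation:

# Illustrative sketch of the triangular CLR policy (Smith, 2017)
clr_triangular <- function(iteration, base_lr = 0.001, max_lr = 0.006,
                           step_size = 2000) {
  cycle <- floor(1 + iteration / (2 * step_size))  # current cycle number
  x <- abs(iteration / step_size - 2 * cycle + 1)  # position within the cycle
  base_lr + (max_lr - base_lr) * max(0, 1 - x)
}
clr_triangular(0)     # 0.001 (lower boundary)
clr_triangular(2000)  # 0.006 (upper boundary after one half cycle)
clr_triangular(4000)  # 0.001 (back at the lower boundary)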

Usage

new_callback_cyclical_learning_rate(base_lr = 0.001, max_lr = 0.006,
  step_size = 2000, mode = "triangular", gamma = 1,
  scale_fn = NULL, scale_mode = "cycle", patience = Inf,
  factor = 0.9, decrease_base_lr = TRUE, cooldown = 2)

Arguments

base_lr

Numeric indicating the initial learning rate, used as the lower boundary of the cycle.

max_lr

Numeric indicating the upper boundary in the cycle. Functionally, it defines the cycle amplitude (max_lr - base_lr). The learning rate at any iteration is the sum of base_lr and some scaling of the amplitude; therefore max_lr may not actually be reached, depending on the scaling function.

step_size

Integer indicating the number of training iterations per half cycle. The authors suggest setting step_size to 2-8 times the number of training iterations in an epoch.
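
For instance, assuming 404 training samples and the Keras default batch size of 32 (as in the example at the bottom of this page), the suggestion works out as follows:

# Hypothetical: 404 training samples at batch size 32 gives
# ceiling(404 / 32) = 13 training iterations per epoch
iterations_per_epoch <- ceiling(404 / 32)
c(lower = 2 * iterations_per_epoch, upper = 8 * iterations_per_epoch)
#> lower upper
#>    26   104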

mode

Character indicating one of the following options. If scale_fn is not NULL, this argument is ignored. Rough R equivalents of the three policies are sketched after this list.

  • "triangular": A basic triangular cycle with no amplitude scaling.

  • "triangular2": A basic triangular cycle that scales initial amplitude by half each cycle.

  • "exp_range": A cycle that scales initial amplitude by gamma^( cycle iterations) at each cycle iteration.

gamma

Numeric indicating the constant to apply when mode = "exp_range". This scaling function applies gamma^(cycle iterations).

scale_fn

Custom scaling policy defined by a single-argument anonymous function, where 0 <= scale_fn(x) <= 1 for all x >= 0. The mode argument is ignored when scale_fn is supplied. Default is NULL.
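
For example, a sinusoidal amplitude policy could be supplied as follows (a hypothetical policy; any function mapping into [0, 1] will do):

# Hypothetical custom policy: amplitude follows a sine wave between 0 and 1,
# evaluated on cycle iterations rather than the cycle number
callback_clr <- new_callback_cyclical_learning_rate(
  base_lr = 0.001, max_lr = 0.006, step_size = 2000,
  scale_fn = function(x) 0.5 * (1 + sin(x * pi / 2)),
  scale_mode = "iterations"
)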

scale_mode

Character of "cycle" or "iterations". Defines whether scale_fn is evaluated on cycle number or cycle iterations (training iterations since start of cycle). Default is "cycle".

patience

Integer indicating the number of epochs of training without validation loss improvement that the callback will wait before it adjusts base_lr and max_lr.

factor

Numeric of length one that scales max_lr and (if decrease_base_lr = TRUE) base_lr after patience epochs without improvement in the validation loss.

decrease_base_lr

Logical indicating whether base_lr should also be scaled by factor. Default is TRUE.

cooldown

Integer indicating the number of epochs to wait before resuming normal operation after the learning rate boundaries have been reduced.
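
Taken together, patience, factor, decrease_base_lr, and cooldown act like a boundary-level analogue of callback_reduce_lr_on_plateau(). A sketch of the arithmetic, not the callback's internals:

# After `patience` epochs without validation loss improvement, both
# boundaries shrink by `factor`; monitoring pauses for `cooldown` epochs
factor <- 0.9
max_lr  <- 0.006 * factor  # new upper boundary: 0.0054
base_lr <- 0.001 * factor  # new lower boundary: 0.0009 (decrease_base_lr = TRUE)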

Value

The callback is a mutable R6 object of class CyclicLR. It exposes two data frames of interest:

history data frame

Contains loss and metric information along with the actual learning rate value for each iteration.

history_epoch data frame

Contains loss and metric information along with learning rate metadata for each epoch.

References

Smith, L.N. Cyclical Learning Rates for Training Neural Networks. arXiv preprint arXiv:1506.01186 (2017). https://arxiv.org/abs/1506.01186

Lorenz Walthert (2020). KerasMisc: Add-ons for Keras. R package version 0.0.0.9001. https://github.com/lorenzwalthert/KerasMisc

Examples

library(keras)

# Load the Boston housing data and split into train/test sets
dataset <- dataset_boston_housing()
c(c(train_data, train_targets), c(test_data, test_targets)) %<-% dataset

# Standardize features using training-set statistics
mean <- apply(train_data, 2, mean)
std <- apply(train_data, 2, sd)
train_data <- scale(train_data, center = mean, scale = std)
test_data <- scale(test_data, center = mean, scale = std)


# Define a simple two-hidden-layer regression model
model <- keras_model_sequential() %>%
  layer_dense(
    units = 64, activation = "relu",
    input_shape = dim(train_data)[[2]]
  ) %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 1)
# Compile with RMSprop starting at the CLR lower boundary
model %>% compile(
  optimizer = optimizer_rmsprop(lr = 0.001),
  loss = "mse",
  metrics = c("mae")
)

# Cyclical learning rate callback with exponentially decaying amplitude
callback_clr <- new_callback_cyclical_learning_rate(
  step_size = 32,
  base_lr = 0.001,
  max_lr = 0.006,
  gamma = 0.99,
  mode = "exp_range"
)
# Train with the CLR callback attached
model %>% fit(
  train_data, train_targets,
  validation_data = list(test_data, test_targets),
  epochs = 10, verbose = 1,
  callbacks = list(callback_clr)
)
# Inspect the per-iteration learning rate history
callback_clr$history
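
# A hypothetical way to visualize the recorded schedule, assuming the
# history data frame stores the per-iteration learning rate in a column
# named `lr`:
plot(callback_clr$history$lr, type = "l",
     xlab = "Iteration", ylab = "Learning rate")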
