optimizer_nadam: Nesterov Adam optimizer

View source: R/kerasOptimizer.R

optimizer_nadam R Documentation

Nesterov Adam optimizer

Description

Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum.
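
As a rough illustration of what "Adam with Nesterov momentum" means, the following base R sketch implements one commonly cited simplified form of the Nadam update. It is not the code behind optimizer_nadam(): the helper nadam_step() is hypothetical, the momentum schedule controlled by schedule_decay is left out, and the epsilon value of 1e-7 is an assumed stand-in for k_epsilon().

```r
# Simplified Nadam update (illustrative sketch only, no momentum schedule).
# nadam_step() is a hypothetical helper, not part of SPOTMisc or keras.
nadam_step <- function(theta, grad, state,
                       learning_rate = 0.002,
                       beta_1 = 0.9,
                       beta_2 = 0.999,
                       epsilon = 1e-7) {  # assumed value for the fuzz factor
  t <- state$t + 1
  # Exponential moving averages of the gradient and the squared gradient
  m <- beta_1 * state$m + (1 - beta_1) * grad
  v <- beta_2 * state$v + (1 - beta_2) * grad^2
  # Bias-corrected moment estimates, as in Adam
  m_hat <- m / (1 - beta_1^t)
  v_hat <- v / (1 - beta_2^t)
  # Nesterov look-ahead: blend the corrected momentum with the current gradient
  m_bar <- beta_1 * m_hat + (1 - beta_1) * grad / (1 - beta_1^t)
  theta <- theta - learning_rate * m_bar / (sqrt(v_hat) + epsilon)
  list(theta = theta, state = list(t = t, m = m, v = v))
}

# Minimise f(x) = x^2 (gradient 2 * x), starting from x = 1
theta <- 1
state <- list(t = 0, m = 0, v = 0)
for (i in 1:100) {
  grad <- 2 * theta
  res <- nadam_step(theta, grad, state)
  theta <- res$theta
  state <- res$state
}
theta  # smaller than the starting value of 1
```

The only difference from a plain Adam step in this sketch is the m_bar line, which mixes the bias-corrected momentum with the current gradient instead of using the momentum alone.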

Usage

optimizer_nadam(
  learning_rate = 0.002,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = NULL,
  schedule_decay = 0.004,
  clipnorm = NULL,
  clipvalue = NULL,
  ...
)

Arguments

learning_rate

float >= 0. Learning rate.

beta_1

The exponential decay rate for the 1st moment estimates. float, 0 < beta < 1. Generally close to 1.

beta_2

The exponential decay rate for the 2nd moment estimates. float, 0 < beta < 1. Generally close to 1.

epsilon

float >= 0. Fuzz factor. If 'NULL', defaults to 'k_epsilon()'.

schedule_decay

Schedule decay. Decay rate of the momentum schedule.

clipnorm

Gradients will be clipped when their L2 norm exceeds this value.

clipvalue

Gradients will be clipped when their absolute value exceeds this value.

...

Unused, present only for backwards compatibility.
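
The two clipping arguments, clipnorm and clipvalue, can be pictured with the base R sketch below. The helpers clip_by_norm() and clip_by_value() are hypothetical and only mimic the behaviour described above on a single gradient vector; the optimizer itself performs clipping in the backend.

```r
# Illustrative helpers only; not part of SPOTMisc or keras.

# Rescale the gradient when its L2 norm exceeds clipnorm
clip_by_norm <- function(grad, clipnorm) {
  grad_norm <- sqrt(sum(grad^2))
  if (grad_norm > clipnorm) grad * (clipnorm / grad_norm) else grad
}

# Limit every gradient element to the interval [-clipvalue, clipvalue]
clip_by_value <- function(grad, clipvalue) {
  pmin(pmax(grad, -clipvalue), clipvalue)
}

g <- c(3, -4)           # L2 norm is 5
clip_by_norm(g, 1)      # 0.6 -0.8 (direction kept, norm reduced to 1)
clip_by_value(g, 2.5)   # 2.5 -2.5 (each element limited separately)
```

Clipping by norm preserves the direction of the gradient, whereas clipping by value can change it because each element is limited independently.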

Details

Default parameters follow those provided in the paper.

Note

To keep the learning rate in a range comparable to that of the other optimizers, learning_rate is internally mapped to 2 * learning_rate. That is, a learning rate of 0.001 will be mapped to 0.002 (which is the default).

See Also

[On the importance of initialization and momentum in deep learning](https://www.cs.toronto.edu/~fritz/absps/momentum.pdf).

Other optimizers: optimizer_adadelta(), optimizer_adagrad(), optimizer_adamax(), optimizer_adam(), optimizer_rmsprop(), optimizer_sgd()


SPOTMisc documentation built on Sept. 5, 2022, 5:06 p.m.