optim_nadam {torchopt}    R Documentation
Description

R implementation of the Nadam optimizer proposed by Dozat (2016).

From the abstract of the paper by Dozat (2016): "This work aims to improve upon the recently proposed and rapidly popularized optimization algorithm Adam (Kingma & Ba, 2014). Adam has two main components—a momentum component and an adaptive learning rate component. However, regular momentum can be shown conceptually and empirically to be inferior to a similar algorithm known as Nesterov's accelerated gradient (NAG)."
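Nadam applies Nesterov-style "look-ahead" momentum to Adam's first-moment estimate, with a momentum schedule controlled by momentum_decay. The sketch below shows one Nadam update for a single scalar parameter in plain R. It illustrates the scheduled-momentum form of Nadam described by Dozat (2016); it is not the internal code of this package, names such as nadam_step and the state list are hypothetical, and weight decay is omitted.

# one schematic Nadam update for a scalar parameter (illustrative only)
nadam_step <- function(theta, g, state, lr = 0.002,
                       beta1 = 0.9, beta2 = 0.999,
                       eps = 1e-8, momentum_decay = 4e-3) {
  t <- state$t + 1
  # momentum schedule: mu_t grows towards beta1 as t increases
  mu_t    <- beta1 * (1 - 0.5 * 0.96^(t * momentum_decay))
  mu_next <- beta1 * (1 - 0.5 * 0.96^((t + 1) * momentum_decay))
  mu_prod <- state$mu_prod * mu_t
  # exponential moving averages of the gradient and its square (as in Adam)
  m <- beta1 * state$m + (1 - beta1) * g
  v <- beta2 * state$v + (1 - beta2) * g^2
  # bias correction with the Nesterov look-ahead on the first moment
  m_hat <- mu_next * m / (1 - mu_prod * mu_next) + (1 - mu_t) * g / (1 - mu_prod)
  v_hat <- v / (1 - beta2^t)
  list(theta = theta - lr * m_hat / (sqrt(v_hat) + eps),
       state = list(t = t, m = m, v = v, mu_prod = mu_prod))
}
# initial state: step counter, moment estimates and momentum product
state0 <- list(t = 0, m = 0, v = 0, mu_prod = 1)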
Usage

optim_nadam(
  params,
  lr = 0.002,
  betas = c(0.9, 0.999),
  eps = 1e-08,
  weight_decay = 0,
  momentum_decay = 0.004
)
Arguments

params
    List of parameters to optimize.

lr
    Learning rate (default: 2e-3).

betas
    Coefficients used for computing running averages of the gradient and its square (default: c(0.9, 0.999)).

eps
    Term added to the denominator to improve numerical stability (default: 1e-8).

weight_decay
    Weight decay (L2 penalty) (default: 0).

momentum_decay
    Momentum decay (default: 4e-3).
Value

A torch optimizer object implementing the step method.
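The returned optimizer is used like any other torch optimizer, via zero_grad(), backward() and step(). A minimal sketch of a training loop (the nn_linear model, the random data and the MSE loss below are illustrative assumptions, not part of this package):

library(torch)
library(torchopt)

# toy model and data (illustrative only)
model <- nn_linear(10, 1)
x <- torch_randn(64, 10)
y <- torch_randn(64, 1)

# optimizer over the model parameters, using the defaults shown above
opt <- optim_nadam(params = model$parameters, lr = 0.002)

for (epoch in 1:5) {
  opt$zero_grad()                    # clear accumulated gradients
  loss <- nnf_mse_loss(model(x), y)  # mean squared error loss
  loss$backward()                    # backpropagate
  opt$step()                         # Nadam parameter update
}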
Author(s)

Gilberto Camara, gilberto.camara@inpe.br
Rolf Simoes, rolf.simoes@inpe.br
Felipe Souza, lipecaso@gmail.com
Alber Sanchez, alber.ipia@inpe.br
References

Timothy Dozat, "Incorporating Nesterov Momentum into Adam", International Conference on Learning Representations (ICLR), 2016. https://openreview.net/pdf/OM0jvwB8jIp57ZJjtNEZ.pdf
Examples

if (torch::torch_is_installed()) {
  # test function to demonstrate optimization: log of the Beale function
  beale <- function(x, y) {
    log((1.5 - x + x * y)^2 + (2.25 - x - x * y^2)^2 + (2.625 - x + x * y^3)^2)
  }
  # define optimizer constructor
  optim <- torchopt::optim_nadam
  # define hyperparameters
  opt_hparams <- list(lr = 0.01)
  # starting point
  x0 <- 3
  y0 <- 3
  # create tensors with gradient tracking
  x <- torch::torch_tensor(x0, requires_grad = TRUE)
  y <- torch::torch_tensor(y0, requires_grad = TRUE)
  # instantiate optimizer
  optim <- do.call(optim, c(list(params = list(x, y)), opt_hparams))
  # run optimizer, recording the path taken
  steps <- 400
  x_steps <- numeric(steps)
  y_steps <- numeric(steps)
  for (i in seq_len(steps)) {
    x_steps[i] <- as.numeric(x)
    y_steps[i] <- as.numeric(y)
    optim$zero_grad()
    z <- beale(x, y)
    z$backward()
    optim$step()
  }
  print(paste0("starting value = ", beale(x0, y0)))
  print(paste0("final value = ", beale(x_steps[steps], y_steps[steps])))
}
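A possible follow-up, assuming the example above has been run (beale, x_steps and y_steps are taken from it): plot the recorded optimization path over a contour of the test function with base R graphics.

# contour of the log-Beale surface with the optimization path (illustrative)
xs <- seq(-0.5, 3.5, length.out = 200)
ys <- seq(-0.5, 3.5, length.out = 200)
zs <- outer(xs, ys, beale)
contour(xs, ys, zs, nlevels = 30, xlab = "x", ylab = "y")
lines(x_steps, y_steps, col = "red", lwd = 2)
points(3, 0.5, pch = 4)  # minimum of the Beale function at (3, 0.5)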