View source: R/sjSDM_configs.R
| madgrad | R Documentation |
MADGRAD optimizer: a momentumized, adaptive, dual averaged gradient method for stochastic optimization
madgrad(momentum = 0.9, weight_decay = 0, eps = 1e-06)
| momentum | strength of momentum |
| weight_decay | L2 penalty on the weights |
| eps | epsilon, a small constant added for numerical stability |
An anonymous function that returns the optimizer when called.
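The return value described above is a deferred constructor: the call stores the settings, and the optimizer is only built when the returned function is invoked. The following base-R sketch illustrates that pattern only. `make_optimizer_config`, the `params` and `lr` arguments of the inner function, and the list standing in for the optimizer are all hypothetical and are not taken from the package source.

```r
# Hypothetical stand-in for madgrad(): it stores the settings and returns an
# anonymous function. Nothing is constructed until that function is called.
make_optimizer_config <- function(momentum = 0.9, weight_decay = 0, eps = 1e-06) {
  function(params, lr) {
    # A plain list stands in for the real optimizer object (assumption).
    list(
      params = params,
      lr = lr,
      momentum = momentum,
      weight_decay = weight_decay,
      eps = eps
    )
  }
}

# Step 1: configure. The result is a function, not an optimizer.
config <- make_optimizer_config(momentum = 0.8)
print(is.function(config))

# Step 2: build the (stand-in) optimizer later, once parameters are known.
opt <- config(params = c(w1 = 0.1, w2 = -0.3), lr = 0.01)
print(opt$momentum)
print(opt$eps)
```

In practice the function returned by `madgrad()` is not called by hand; it is handed to the model-fitting code, which is assumed here to accept it through its control settings (for example an `optimizer` argument) and to call it internally once the model parameters exist.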
Defazio, A., & Jelassi, S. (2021). Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization. arXiv preprint arXiv:2101.11075.