Another updater with adaptive step sizes, in the same family as Adagrad and Adadelta.
https://climin.readthedocs.org/en/latest/rmsprop.html
learning.rate
the learning rate (set to one in the original paper)
squared.grad
a matrix accumulating the squared gradients over previous updates (decayed according to gamma)
decay
how quickly should squared gradients decay?
delta
the delta matrix (see updater)
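The update these arguments describe can be sketched as follows. This is an illustrative Python version of the RMSprop rule from the linked climin documentation, not the package's own code: the function name, the `eps` stabilizer, and the default values are assumptions; `learning_rate` corresponds to `learning.rate`, `decay` to gamma, and the returned step to `delta`.

```python
import numpy as np

def rmsprop_update(param, grad, squared_grad,
                   learning_rate=1.0, decay=0.9, eps=1e-8):
    """One RMSprop step (illustrative sketch, not the package implementation)."""
    # Decaying accumulator of squared gradients, controlled by `decay` (gamma).
    squared_grad = decay * squared_grad + (1.0 - decay) * grad ** 2
    # The delta: gradient scaled by the root of the running average;
    # eps guards against division by zero (an assumed stabilizer).
    delta = learning_rate * grad / (np.sqrt(squared_grad) + eps)
    return param - delta, squared_grad
```

Because each coordinate is divided by its own running root-mean-square gradient, the effective step size adapts per parameter, which is why the learning rate can be left at one as in the original paper.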