optimizerSGD: Functions to optimize the gradient descent of a cost function

optimizerSGD R Documentation

Functions to optimize the gradient descent of a cost function

Description

Different types of optimizer functions, such as SGD, Momentum, AdamG, and NAG.

Usage

  optimizerMomentum(V, dW, W, alpha = 0.63, lr = 1e-4, lambda = 1) 

Arguments

V

The momentum (velocity) term, typically initialized as V = 0. Momentum update: V = alpha*V - lr*(dW + lambda*W); W = W + V. NAG update: V = alpha*(V - lr*(dW + lambda*W)); W = W + V - lr*(dW + lambda*W). A standalone sketch of these updates is given after this Arguments list.

dW

derivative of the cost with respect to W; it can be found by dW = bwdNN2(dy, cache, model).

W

weights of the DNN model, optimized by W = W + V.

alpha

Momentum rate, 0 < alpha < 1; the default is alpha = 0.63.

lr

learning rate; the default is lr = 1e-4.

lambda

regularization rate for the penalized cost, cost + 0.5*lambda*||W||^2; the default is lambda = 1.0.
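
The updates listed for the V argument can be written out directly. The helpers below are a minimal standalone sketch of those formulas in base R; the names momentum_step_sketch and nag_step_sketch are illustrative only and are not the exported optimizerMomentum/optimizerNAG functions.

  # Sketch only: mirrors the update formulas given for the V argument above;
  # not the package's exported optimizerMomentum/optimizerNAG.
  momentum_step_sketch = function(V, dW, W, alpha = 0.63, lr = 1e-4, lambda = 1) {
    V = alpha*V - lr*(dW + lambda*W)    # velocity update with weight decay
    W = W + V                           # move the weights along the velocity
    list(V = V, W = W)
  }
  nag_step_sketch = function(V, dW, W, alpha = 0.63, lr = 1e-4, lambda = 1) {
    V = alpha*(V - lr*(dW + lambda*W))  # look-ahead velocity
    W = W + V - lr*(dW + lambda*W)      # Nesterov correction on the weights
    list(V = V, W = W)
  }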

Details

For SGD with momentum, use

V = 0; obj = optimizerMomentum(V, dW, W); V = obj$V; W = obj$W

For SGD with NAG (Nesterov accelerated gradient), use

V = 0; obj = optimizerNAG(V, dW, W); V = obj$V; W = obj$W
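
The following self-contained sketch shows the same V = 0; obj = ...; V = obj$V; W = obj$W pattern on a simple least-squares cost, using the momentum_step_sketch helper defined after the Arguments list; the data, cost, and settings are hypothetical and chosen only for illustration (lambda = 0, i.e. no weight decay). For NAG, swap in nag_step_sketch in the same way.

  # Hypothetical least-squares illustration of the update pattern above,
  # using the sketch helper, not the package API.
  set.seed(1)
  X = matrix(rnorm(200), 100, 2)              # 100 observations, 2 features
  W_true = matrix(c(1.5, -2), 2, 1)
  y = X %*% W_true + rnorm(100, sd = 0.1)
  W = matrix(0, 2, 1)                         # weights to be learned
  V = 0                                       # initial momentum term
  for (i in 1:2000) {
    dW = -2 * t(X) %*% (y - X %*% W) / 100    # gradient of the mean squared cost
    obj = momentum_step_sketch(V, dW, W, alpha = 0.63, lr = 0.01, lambda = 0)
    V = obj$V; W = obj$W
  }
  W                                           # should be close to W_true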

Value

Returns the updated W, along with other parameters such as V, V1 and V2, that will be used in the next step of SGD.

Author(s)

Bingshu E. Chen

See Also

activation, bwdNN, fwdNN, dNNmodel, dnnFit
