View source: R/function_FISTA.R
descent_generalized_fista_cxx - R Documentation
descent_generalized_fista takes as input a vector, a gradient function
and an evaluation function (plus some additional parameters/functions).
It then iteratively minimizes the objective over the tweak vector via FISTA
(Beck and Teboulle 2009). In essence, the algorithm proceeds as follows:
# preliminary initialization step
for k = 2, ..., maxit:
    y_vec = NORM.FUN(y_vec)
    grad  = F.GRAD.FUN(y_vec)

    # find the best step size between 0 and learning.rate
    for step.size = 0, ..., learning.rate:
        u    = ST.FUN(y_vec - step.size * grad, step.size * lambda)
        eval = EVAL.FUN(u)
    # keep only the u with minimal eval

    tweak.old = tweak.vec
    # if it was a descent step:
    tweak.vec = u
    # if it was not a descent step:
    tweak.vec = tweak.vec
    # (and restart, as suggested in O'Donoghue & Candes (2012))

    # Nesterov extrapolation:
    nesterov.direction = tweak.vec - tweak.old
    # find the best extrapolation size between 0 and FACTOR.FUN(k):
    for ne = 0, ..., FACTOR.FUN(k):
        y_vec = u + ne * nesterov.direction
        eval  = EVAL.FUN(y_vec)
    # keep only the y_vec with minimal eval

    # stop criterion: if the descent in either the gradient step or the
    # Nesterov step was below a critical value, stop.
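The sketch below is a minimal, self-contained R transcription of this loop, not the package's C++ implementation. 'grad_fun', 'eval_fun' and 'st_fun' are illustrative stand-ins for F.GRAD.FUN, EVAL.FUN and ST.FUN, '(k - 1) / (k + 2)' stands in for FACTOR.FUN(k), and the NORM.FUN step as well as the 'line.search.speed' adjustment of the learning rate are omitted for brevity.

st_fun <- function(x, thresh) {            # soft-thresholding (proximal step)
  sign(x) * pmax(abs(x) - thresh, 0)
}

fista_sketch <- function(x0, grad_fun, eval_fun, lambda = 0.01,
                         learning_rate = 1, maxit = 500,
                         stop_crit_threshold = 1e-13, cycles = 5,
                         use_restart = TRUE) {
  tweak_vec <- x0
  y_vec <- x0
  eval_old <- eval_fun(tweak_vec)
  for (k in 2:maxit) {
    grad <- grad_fun(y_vec)
    # line search: try 'cycles' step sizes between 0 and learning_rate
    step_sizes <- learning_rate * seq_len(cycles) / cycles
    candidates <- lapply(step_sizes,
                         function(s) st_fun(y_vec - s * grad, s * lambda))
    evals <- vapply(candidates, eval_fun, numeric(1))
    u <- candidates[[which.min(evals)]]
    eval_u <- min(evals)

    tweak_old <- tweak_vec
    if (eval_u < eval_old) {
      tweak_vec <- u                       # descent step: accept u
    } else if (use_restart) {
      y_vec <- tweak_vec                   # not a descent step: restart momentum
      next
    }

    # Nesterov extrapolation: search over extrapolation factors
    direction <- tweak_vec - tweak_old
    factors <- seq(0, (k - 1) / (k + 2), length.out = cycles)
    y_cands <- lapply(factors, function(f) tweak_vec + f * direction)
    y_evals <- vapply(y_cands, eval_fun, numeric(1))
    y_vec <- y_cands[[which.min(y_evals)]]

    # stop when neither the gradient nor the Nesterov step improved enough
    if (abs(eval_old - min(eval_u, min(y_evals))) < stop_crit_threshold) break
    eval_old <- min(eval_old, eval_u)
  }
  tweak_vec
}

# toy lasso objective 0.5 * ||A x - b||^2 + lambda * ||x||_1 to exercise the sketch
set.seed(1)
A <- matrix(rnorm(200), nrow = 20)
b <- rnorm(20)
grad_fun <- function(x) as.vector(t(A) %*% (A %*% x - b))
eval_fun <- function(x) 0.5 * sum((A %*% x - b)^2) + 0.01 * sum(abs(x))
x_hat <- fista_sketch(rep(0, 10), grad_fun, eval_fun, lambda = 0.01)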
Usage:

descent_generalized_fista_cxx(
  model,
  lambda = 0.01,
  maxit = 500,
  stop.crit.threshold = 1e-13,
  save.all.tweaks = FALSE,
  learning.rate = NA,
  line.search.speed = 2,
  cycles = 5,
  use.restart = TRUE,
  verbose = FALSE,
  ...
)
Arguments:

model
    a model as constructed in interface_cxx.R

lambda
    non-negative float, regularization factor for the ST.FUN function.

maxit
    integer, maximum number of iterations for the iterative minimization.

stop.crit.threshold
    numeric value. The change in either the gradient step or the Nesterov step
    must be at least 'stop.crit.threshold', otherwise the algorithm stops.

save.all.tweaks
    logical, should the tweak vectors of all iterations be stored?

learning.rate
    float, step size during optimization. If NA, the learning rate is estimated
    as proposed by Barzilai & Borwein (1988); see the sketch after this list.
    Note that the algorithm adjusts the learning rate during optimization.

line.search.speed
    numeric, factor by which the learning rate changes during optimization.
    If 'line.search.speed' is, e.g., 2, the learning rate is doubled when the
    highest 'cycle' led to the best eval score, and halved when 'cycle = 0'
    led to the best eval score.

cycles
    integer. In each iteration one gradient is calculated; to find the best
    step size, 'cycles' candidate steps are taken and each is evaluated, and
    the step size with the best eval score is kept.

use.restart
    logical, restart the algorithm if the update was not a descent step.

verbose
    logical, if set to TRUE, information is printed during the iterations.
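As an illustration of the Barzilai & Borwein (1988) estimate mentioned under 'learning.rate', the following sketch derives a starting step size from two gradient evaluations. 'grad_fun' is again an illustrative placeholder for the model's gradient function, and the probing step 'eps' is an assumption, not a value taken from the package.

bb_step_size <- function(x0, grad_fun, eps = 1e-4) {
  g0 <- grad_fun(x0)
  x1 <- x0 - eps * g0                      # small probing step along the gradient
  g1 <- grad_fun(x1)
  dx <- x1 - x0
  dg <- g1 - g0
  sum(dx * dg) / sum(dg * dg)              # BB step size: <dx, dg> / <dg, dg>
}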
Value:

a list that contains the trained model and its history.
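A hypothetical call could look as follows; it assumes 'model' has already been built with the constructor provided in interface_cxx.R (not shown here), so the snippet is illustrative only.

fit <- descent_generalized_fista_cxx(
  model,
  lambda = 0.01,
  maxit = 500,
  save.all.tweaks = TRUE,
  verbose = TRUE
)
# 'fit' is a list holding the trained model and its history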