Minimization algorithm intended for nonsmooth, nonconvex functions, but also applicable to functions that are smooth, convex or both.

1 2 3 4 5 6 | ```
hanso(fn, gr, x0 = NULL,upper = 1, lower = 0, nvar = 0, nstart = 10,
maxit = 1000, maxitgs=100, normtol = 1e-06, fvalquit = -Inf,
xnormquit = Inf, nvec = 0, prtlevel = 1, strongwolfe = 0,
wolfe1 = 1e-04, wolfe2 = 0.5, quitLSfail = 1,
ngrad = min(100, 2 * nvar, nvar + 10), evaldist = 1e-04,
H0 = diag(nvar), scale = 1, samprad = c(1e-04, 1e-05, 1e-06))
``` |

`fn` |
A function to be minimized. fn(x) takes input as a vector of parameters over which minimization is to take place. fn() returns a scaler. |

`gr` |
A function to return the gradient for fn(x). |

`x0` |
Matrix, with dimension (nvar x nstart), each column represent an initial guess. |

`upper` |
upper bound for the initial values |

`lower` |
lower bound for initial values. |

`nvar` |
Number of parameters that fn() takes. |

`nstart` |
Number of initial guesses. Default is 10. |

`maxit` |
maximum number of iterations for bfgs. |

`maxitgs` |
maximum number of iterations for gradient sampling. Warning!: Gradient Sampling is expensive. |

`normtol` |
termination tolerance on d: smallest vector in convex hull of up to options.ngrad gradients (default: 1e-6) |

`fvalquit` |
quit if f drops below this value. |

`xnormquit` |
quit if norm(x) drops below this. |

`nvec` |
0 for saving bfgs values. |

`prtlevel` |
prints messages if this is 1 |

`strongwolfe` |
if this is 1, strong line search is used, otherwise, weak line search is used. Default is 0. |

`wolfe1` |
wolfe line search parameter 1. |

`wolfe2` |
wolfe line search parameter 2. |

`quitLSfail` |
1 if want to quit when line search fails. 0 otherwise. |

`ngrad` |
number of gradients willing to save and use in solving QP to check optimality tolerance on smallest vector in their convex hull; see also next two options |

`evaldist` |
the gradients used in the termination test qualify only if they are evaluated at points approximately within distance options.evaldist of x |

`H0` |
Initial inverse hessian approximation. Default is NULL |

`scale` |
1 to scale H0 at first iteration, 0 otherwise. |

`samprad` |
vector of sampling radii. For gradient sampling. |

It is a two phase process, BFGS phase works for smooth functions, gradient sampling phase works for non smooth ones. Gradient sampling uses the minimum point found by BFGS as its starting point.

BFGS phase: BFGS is run from multiple starting points, taken from the columns of x0, if provided, and otherwise 10 points generated randomly. If the termination test was satisfied at the best point found by BFGS, HANSO terminates; otherwise, it continues to:

Gradient sampling phases: 3 gradient sampling phases are run from lowest point found, using sampling radii: 10*evaldist, evaldist, evaldist/10

Returns a list containing the following fields:

`x` |
a matrix with k'th column containing final value of x obtained from k'th column of x0. |

`f` |
a vector of final obtained minimum values of fn() at the initial points. |

`loc` |
local optimality certificate, list with 2 fields: dnorm: norm of a vector in the convex hull of gradients of the function evaluated at and near x evaldist: specifies max distance from x at which these gradients were evaluated. The smaller loc$dnorm and loc$evaldist are, the more likely it is that x is an approximate local minimizer. |

`H` |
final BFGS inverse hessian approximation |

`X` |
iterates, where saved gradients were evaluated. |

`G` |
saved gradients used for computation. |

`w` |
weights giving the smallest vector. |

Copyright (c) 2010 Michael Overton for Matlab code and documentation, with permission converted to R by Abhirup Mallik (and Hans W Borchers). malli066@umn.edu

A.S. Lewis and M.L. Overton, Nonsmooth Optimization via BFGS, 2008.

J.V. Burke, A.S. Lewis and M.L. Overton, A Robust Gradient Sampling
Algorithm for Nonsmooth, Nonconvex Optimization
SIAM J. Optimization 15 (2005), pp. 751-779

`shor`

, `gradsamp`

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 | ```
## Nesterov function in 2 dimensions
nesterov_f <- function(x) {
x1 <- x[1]; x2 <- x[2]
1/4*(x1-1)^2 + abs(x2 - 2*x1^2+1)
}
nesterov_g <- function(x) { # analytical gradient
g <- matrix(NA, 2, 1)
x1 <- x[1]; x2 <- x[2]
sgn <- sign(x2 - 2*x1^2 + 1)
if (sgn != 0) {
g[1] <- 0.5*(x1-1) - sgn*4*x1
g[2] <- sgn
}
g
}
hanso(nesterov_f, nesterov_g, nvar = 2)
# Hanso: Best value found by BFGS = 0.1401677
# gradsamp: not descent direction, quit at iter= 93
# gradsamp: not descent direction, quit at iter= 9
# gradsamp: not descent direction, quit at iter= 1
# Best value found by Gradient Sampling = 1.065261e-09
# $x
# [,1]
# [1,] 0.9999712
# [2,] 0.9998849
#
# $f
# [1] 1.065261e-09
#
# ...
## Not run:
## Shor's piecewise quadratic function
hanso(shor_f, shor_g, nvar=5)
# Hanso: Best value found by BFGS = 24.99301
# gradsamp: not descent direction, quit at iter= 100
# gradsamp: not descent direction, quit at iter= 38
# gradsamp: not descent direction, quit at iter= 7
# Best value found by Gradient Sampling = 22.60016
# $x
# [,1]
# [1,] 1.1243379
# [2,] 0.9794677
# [3,] 1.4777124
# [4,] 0.9202101
# [5,] 1.1242998
#
# $f
# [1] 22.60016
#
# ...
## End(Not run)
``` |

Questions? Problems? Suggestions? Tweet to @rdrrHQ or email at ian@mutexlabs.com.

All documentation is copyright its authors; we didn't write any of that.