armijo: Generic functions to aid finding local minima given search...

armijoR Documentation

Generic functions to aid finding local minima given search direction

Description

Allows use of an Armijo rule or coarse line search as part of minimisation (or maximisation) of a differentiable function of multiple arguments (via gradient descent or similar). Repeated application of one of these rules should (hopefully) lead to a local minimum.

Usage

armijo(
  fun,
  x,
  dx,
  beta = 3,
  sigma = 0.5,
  grad,
  maximise = FALSE,
  searchup = TRUE,
  adj.start = 1,
  ...
)

coarseLine(fun, x, dx, beta = 3, maximise = FALSE, ...)

Arguments

fun

a function whose first argument is a numeric vector

x

a starting value to be passed to fun

dx

numeric vector containing feasible direction for search; defaults to -grad for ordinary gradient descent

beta

numeric value (greater than 1) giving factor by which to adjust step size

sigma

numeric value (less than 1) giving steepness criterion for move

grad

numeric gradient of f at x (will be estimated if not provided)

maximise

logical: if set to TRUE search is for a maximum rather than a minimum.

searchup

logical: if set to TRUE method will try to find largest move satisfying Armijo criterion, rather than just accepting the first it sees

adj.start

an initial adjustment factor for the step size.

...

other arguments to be passed to fun

Details

coarseLine performs a stepwise search and tries to find the integer k minimising f(x_k) where

x_k = x + β^k dx.

Note k may be negative. This is genearlly quicker and dirtier than the Armijo rule.

armijo implements an Armijo rule for moving, which is to say that

f(x_k) - f(x) < - σ β^k dx . grad.

This has better convergence guarantees than a simple line search, but may be slower in practice. See Bertsekas (1999) for theory underlying the Armijo rule.

Each of these rules should be applied repeatedly to achieve convergence (see example below).

Value

A list comprising

best

the value of the function at the final point of evaluation

adj

the constant in the step, i.e. β^n

move

the final move; i.e. β^n dx

code

an integer indicating the result of the function; 0 = returned OK, 1 = very small move suggested, may be at minimum already, 2 = failed to find minimum: function evaluated to NA or was always larger than f(x) (direction might be infeasible), 3 = failed to find minimum: stepsize became too small or large without satisfying rule.

Functions

  • coarseLine: Coarse line search

Author(s)

Robin Evans

References

Bertsekas, D.P. Nonlinear programming, 2nd Edition. Athena, 1999.

Examples


# minimisation of simple function of three variables
x = c(0,-1,4)
f = function(x) ((x[1]-3)^2 + sin(x[2])^2 + exp(x[3]) - x[3])

tol = .Machine$double.eps
mv = 1

while (mv > tol) {
  # or replace with coarseLine()
  out = armijo(f, x, sigma=0.1)
  x = out$x
  mv = sum(out$move^2)
}

# correct solution is c(3,0,0) (or c(3,k*pi,0) for any integer k)
x



rje documentation built on Nov. 12, 2022, 9:06 a.m.