
armijo                                        R Documentation

Generic functions to aid finding local minima given search direction

Description

Allows use of an Armijo rule or coarse line search as part of minimisation (or maximisation) of a differentiable function of multiple arguments (via gradient descent or similar). Repeated application of one of these rules should (hopefully) lead to a local minimum.

Usage

armijo(
  fun,
  x,
  dx,
  beta = 3,
  sigma = 0.5,
  grad,
  maximise = FALSE,
  searchup = TRUE,
  adj.start = 1,
  ...
)

coarseLine(fun, x, dx, beta = 3, maximise = FALSE, ...)

Arguments

fun

a function whose first argument is a numeric vector

x

a starting value to be passed to fun

dx

numeric vector containing feasible direction for search; defaults to -grad for ordinary gradient descent

beta

numeric value (greater than 1) giving factor by which to adjust step size

sigma

numeric value (less than 1) giving steepness criterion for move

grad

numeric gradient of fun at x (will be estimated if not provided)

maximise

logical: if set to TRUE, the search is for a maximum rather than a minimum.

searchup

logical: if set to TRUE, the method tries to find the largest move satisfying the Armijo criterion, rather than accepting the first one it finds

adj.start

an initial adjustment factor for the step size.

...

other arguments to be passed to fun
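
For example (a minimal sketch with a made-up objective g and extra argument centre, relying only on the documented behaviour that additional named arguments are forwarded to fun), extra parameters can be passed through ... and a known gradient supplied via grad:

g  <- function(x, centre) sum((x - centre)^2)
x0 <- c(0, 0)
out <- armijo(g, x0, grad = 2 * (x0 - 1), centre = 1)  # centre is forwarded to g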

Details

coarseLine performs a stepwise search, attempting to find the integer k minimising f(x_k), where

x_k = x + \beta^k dx.

Note that k may be negative. This is generally quicker and dirtier than the Armijo rule.
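
As a rough illustration of the idea (a hand-rolled sketch, not the package's implementation), the search compares the objective over the grid x + \beta^k dx for integer k and keeps the best exponent:

f  <- function(x) sum(x^2)
x  <- c(1, 2)
dx <- -2 * x                                     # descent direction (minus the gradient)
k  <- -5:5                                       # small grid of integer exponents
vals <- sapply(k, function(j) f(x + 3^j * dx))   # beta = 3
k[which.min(vals)]                               # exponent giving the smallest value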

armijo implements an Armijo rule for moving, which is to say that

f(x_k) - f(x) < - \sigma \beta^k dx \cdot \nabla_x f.

This has better convergence guarantees than a simple line search, but may be slower in practice. See Bertsekas (1999) for theory underlying the Armijo rule.
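For orientation, a textbook backtracking version of a sufficient-decrease test in the spirit of Bertsekas (1999) looks roughly like the sketch below (this is illustrative only, not the package's internal code, and its sign convention is the standard one for a descent direction):

f  <- function(x) sum(x^2)
x  <- c(1, 2)
g  <- 2 * x                      # gradient of f at x
dx <- -g                         # descent direction
sigma <- 0.5; beta <- 3; t <- 1
while (f(x + t * dx) - f(x) > sigma * t * sum(g * dx)) {
  t <- t / beta                  # shrink the step until sufficient decrease holds
}
x + t * dx                       # accepted point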

Each of these rules should be applied repeatedly to achieve convergence (see example below).

Value

A list comprising

best

the value of the function at the final point of evaluation

adj

the constant in the step, i.e. \beta^n

move

the final move; i.e. \beta^n dx

code

an integer indicating the outcome of the search: 0 = returned OK; 1 = very small move suggested, so the point may already be at a minimum; 2 = failed to find a minimum: the function evaluated to NA or was always larger than f(x) (the direction may be infeasible); 3 = failed to find a minimum: the step size became too small or too large without satisfying the rule.
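
A typical driver loop might stop on this code; a minimal sketch (assuming only the fields listed above, with a hypothetical objective f):

f <- function(x) sum((x - 1)^2)
x <- c(5, 5)
repeat {
  out <- armijo(f, x, sigma = 0.1)
  if (out$code != 0 || sum(out$move^2) < 1e-10) break  # stop on failure or a tiny move
  x <- x + out$move              # apply the accepted step (move = beta^n dx)
}
x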

Functions

  • coarseLine(): Coarse line search

Author(s)

Robin Evans

References

Bertsekas, D. P. (1999). Nonlinear Programming, 2nd edition. Athena Scientific.

Examples


# minimisation of simple function of three variables
x = c(0,-1,4)
f = function(x) ((x[1]-3)^2 + sin(x[2])^2 + exp(x[3]) - x[3])

tol = .Machine$double.eps
mv = 1

while (mv > tol) {
  # or replace with coarseLine()
  out = armijo(f, x, sigma=0.1)
  x = out$x
  mv = sum(out$move^2)
}

# correct solution is c(3,0,0) (or c(3,k*pi,0) for any integer k)
x
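
# A variant of the same loop using coarseLine() instead (a rough sketch; it relies
# only on the documented 'move' component, and uses a looser tolerance with an
# iteration cap since the coarse search is quicker and dirtier)
x = c(0,-1,4)
mv = 1
iter = 0
while (mv > 1e-10 && iter < 100) {
  out = coarseLine(f, x)
  x = x + out$move
  mv = sum(out$move^2)
  iter = iter + 1
}
x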


