psqn_bfgs: BFGS Implementation Used Internally in the psqn Package

psqn_bfgs    R Documentation

BFGS Implementation Used Internally in the psqn Package

Description

The main difference from the BFGS method in optim appears to be the line search. This version uses an interpolation-based line search with a zoom phase that uses cubic interpolation, as described by Nocedal and Wright (2006).

Usage

psqn_bfgs(
  par,
  fn,
  gr,
  rel_eps = 1e-08,
  max_it = 100L,
  c1 = 1e-04,
  c2 = 0.9,
  trace = 0L,
  env = NULL,
  gr_tol = -1,
  abs_eps = -1
)

Arguments

par

Initial values for the parameters.

fn

Function that evaluates the objective function to be minimized.

gr

Function that evaluates the gradient of fn. The returned gradient vector should carry the function value as an attribute called "value".

rel_eps

Relative convergence threshold.

max_it

Maximum number of iterations.

c1

Threshold for the sufficient decrease part of the Wolfe conditions.

c2

Threshold for the curvature part of the Wolfe conditions.

trace

Integer where larger values give more information during the optimization.

env

Environment to evaluate fn and gr in. NULL yields the global environment.

gr_tol

Convergence tolerance for the Euclidean norm of the gradient. A negative value yields no check.

abs_eps

Absolute convergence threshold. A negative value yields no check.
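
To make the roles of c1 and c2 concrete, the following is a minimal sketch of the textbook Wolfe conditions from Nocedal and Wright (2006) for a trial step length. It is not the package's internal code: the function wolfe_ok, its argument names, and the strong switch are hypothetical and only serve as illustration, and this page does not state whether the package checks the weak or the strong variant.

```r
# Illustrative check of the Wolfe conditions for a trial step length alpha.
# `wolfe_ok` and its arguments are hypothetical; this is not psqn code.
wolfe_ok <- function(fn, gr, x, dir, alpha, c1 = 1e-4, c2 = 0.9,
                     strong = FALSE) {
  f0 <- fn(x)
  d0 <- sum(gr(x) * dir)       # directional derivative at x
  x_new <- x + alpha * dir
  f1 <- fn(x_new)
  d1 <- sum(gr(x_new) * dir)   # directional derivative at the trial point

  # sufficient decrease (Armijo) condition, controlled by c1
  sufficient_decrease <- f1 <= f0 + c1 * alpha * d0
  # curvature condition, controlled by c2 (weak or strong variant)
  curvature <- if (strong) abs(d1) <= c2 * abs(d0) else d1 >= c2 * d0

  sufficient_decrease && curvature
}

# simple quadratic test problem: f(x) = sum(x^2) / 2 with gradient x
fn_quad <- function(x) sum(x^2) / 2
gr_quad <- function(x) x
x0  <- c(1, -2)
dir <- -gr_quad(x0)            # steepest descent direction

wolfe_ok(fn_quad, gr_quad, x0, dir, alpha = 1)     # exact minimizer along dir
wolfe_ok(fn_quad, gr_quad, x0, dir, alpha = 1e-6)  # too short: curvature fails
wolfe_ok(fn_quad, gr_quad, x0, dir, alpha = 2)     # too long: decrease fails
```

A small c1 accepts almost any step that decreases the function, while a c2 close to one only rules out steps where the slope along the search direction has barely changed.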

Value

An object like the object returned by psqn.

References

Nocedal, J. and Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer.

Examples

# declare function and gradient from the example from help(optim)
fn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
gr <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
     200 *      (x2 - x1 * x1))
}

# psqn_bfgs needs a gradient function that also returns the function
# value as the "value" attribute
gr_psqn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  out <- c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
            200 *      (x2 - x1 * x1))
  attr(out, "value") <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
  out
}

# we get the same result from both methods
optim    (c(-1.2, 1), fn, gr, method = "BFGS")
psqn_bfgs(c(-1.2, 1), fn, gr_psqn)

# compare the computation time
system.time(replicate(1000,
                      optim    (c(-1.2, 1), fn, gr, method = "BFGS")))
system.time(replicate(1000,
                      psqn_bfgs(c(-1.2, 1), fn, gr_psqn)))

# we can use an alternative convergence criterion
org <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1e-4)
sqrt(sum(gr_psqn(org$par)^2))

new_res <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1e-4, gr_tol = 1e-8)
sqrt(sum(gr_psqn(new_res$par)^2))

new_res <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1, abs_eps = 1e-2)
new_res$value - org$value # close to the optimum (but this is not guaranteed)
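
When a separate function and gradient already exist, as fn and gr do above, a gradient of the form that the gr argument expects can be built by wrapping them. The sketch below assumes only what the Arguments section states, namely that the gradient carries the function value in an attribute called "value". The helper with_value is hypothetical and not part of the psqn package.

```r
# `with_value` is a hypothetical helper, not part of psqn. It combines a
# function and its gradient into one function that returns the gradient
# with the function value attached as the "value" attribute.
with_value <- function(fn, gr) {
  function(x) {
    out <- gr(x)
    attr(out, "value") <- fn(x)
    out
  }
}

# function and gradient from the example in help(optim)
fn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
gr <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
     200 *      (x2 - x1 * x1))
}

gr_wrapped <- with_value(fn, gr)
res <- gr_wrapped(c(-1.2, 1))
attr(res, "value")  # the function value at the starting point
as.numeric(res)     # the gradient without the attribute
```

The wrapper evaluates fn and gr separately, so it cannot share intermediate terms between the two. A hand-written version such as gr_psqn can reuse them, which may matter when the function is expensive to evaluate.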

psqn documentation built on March 18, 2022, 7:50 p.m.