psqn_bfgs: BFGS Implementation Used Internally in the psqn Package
In psqn: Partially Separable Quasi-Newton

psqn_bfgs

R Documentation

Description

The method seems to differ from the BFGS implementation in optim mainly in the line search. This version uses an interpolation-based line search with a zoom phase that uses cubic interpolation, as described by Nocedal and Wright (2006).
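
For orientation, the Wolfe conditions that such a line search targets can be written as follows. This is the textbook strong form from Nocedal and Wright (2006), not a statement taken from the package source: whether psqn_bfgs enforces the strong or the weak form is an assumption here, as is the mapping of the constants to the `c1` and `c2` arguments. The symbols $x_k$ (current iterate), $p_k$ (search direction) and $\alpha$ (step length) are notation introduced for this sketch.

```latex
\begin{align*}
f(x_k + \alpha p_k) &\le f(x_k) + c_1 \alpha \nabla f(x_k)^\top p_k
  && \text{(sufficient decrease)} \\
\left| \nabla f(x_k + \alpha p_k)^\top p_k \right|
  &\le c_2 \left| \nabla f(x_k)^\top p_k \right|
  && \text{(curvature)}
\end{align*}
\qquad \text{with } 0 < c_1 < c_2 < 1 .
```

The weak form replaces the curvature condition with $\nabla f(x_k + \alpha p_k)^\top p_k \ge c_2 \nabla f(x_k)^\top p_k$. The defaults `c1 = 1e-04` and `c2 = 0.9` match the values commonly suggested for quasi-Newton methods in that reference.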

Usage

psqn_bfgs(
  par,
  fn,
  gr,
  rel_eps = 1e-08,
  max_it = 100L,
  c1 = 1e-04,
  c2 = 0.9,
  trace = 0L,
  env = NULL,
  gr_tol = -1,
  abs_eps = -1
)

Arguments

`par`	Initial values for the parameters.
`fn`	Function that evaluates the objective to be minimized.
`gr`	Gradient of `fn`. Should return the function value as an attribute called `"value"`.
`rel_eps`	Relative convergence threshold.
`max_it`	Maximum number of iterations.
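`c1`	Threshold for the Wolfe conditions (the sufficient decrease constant in the notation of Nocedal and Wright, 2006).
`c2`	Threshold for the Wolfe conditions (the curvature constant in the notation of Nocedal and Wright, 2006).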
`trace`	Integer where larger values give more information during the optimization.
`env`	Environment to evaluate `fn` and `gr` in. `NULL` yields the global environment.
`gr_tol`	Convergence tolerance for the Euclidean norm of the gradient. A negative value yields no check.
`abs_eps`	Absolute convergence threshold. A negative value yields no check.
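
Because `gr` has to return the function value as the `"value"` attribute, an existing function and gradient pair can be wrapped rather than rewritten by hand. The following is a minimal sketch in base R. `with_value`, `quad_fn`, `quad_gr` and `quad_gr_psqn` are hypothetical names used only for illustration and are not part of psqn.

```r
# Hypothetical helper (not part of psqn): combine an objective and its
# gradient into one function that returns the gradient with the
# objective value attached as the "value" attribute.
with_value <- function(fn, gr) {
  force(fn)
  force(gr)
  function(x) {
    out <- gr(x)
    attr(out, "value") <- fn(x)
    out
  }
}

# small quadratic used only for illustration
quad_fn <- function(x) sum((x - c(1, 2))^2)
quad_gr <- function(x) 2 * (x - c(1, 2))

quad_gr_psqn <- with_value(quad_fn, quad_gr)
quad_gr_psqn(c(0, 0))
```

The wrapped function could then be passed as the `gr` argument, e.g. `psqn_bfgs(c(0, 0), quad_fn, quad_gr_psqn)`. The wrapper evaluates `fn` and `gr` separately, so it does not share intermediate computations between the two the way a hand-written version can.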

Value

An object like the object returned by psqn.

References

Nocedal, J. and Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer.

Examples

# declare function and gradient from the example from help(optim)
fn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
gr <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
     200 *      (x2 - x1 * x1))
}

# we need a different function for the method in this package
gr_psqn <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  out <- c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
            200 *      (x2 - x1 * x1))
  attr(out, "value") <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
  out
}

# we get the same result
optim    (c(-1.2, 1), fn, gr, method = "BFGS")
psqn_bfgs(c(-1.2, 1), fn, gr_psqn)

# compare the computation time
system.time(replicate(1000,
                      optim    (c(-1.2, 1), fn, gr, method = "BFGS")))
system.time(replicate(1000,
                      psqn_bfgs(c(-1.2, 1), fn, gr_psqn)))

# we can use an alternative convergence criterion
org <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1e-4)
sqrt(sum(gr_psqn(org$par)^2))

new_res <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1e-4, gr_tol = 1e-8)
sqrt(sum(gr_psqn(new_res$par)^2))

new_res <- psqn_bfgs(c(-1.2, 1), fn, gr_psqn, rel_eps = 1, abs_eps = 1e-2)
new_res$value - org$value # roughly within abs_eps (but this is not guaranteed)

psqn documentation built on March 18, 2022, 7:50 p.m.