BBoptim: Large-Scale Nonlinear Optimization - A Wrapper for spg() In BB: Solving and Optimizing Large-Scale Nonlinear Systems

Description

A strategy using different Barzilai-Borwein steplengths to optimize a nonlinear objective function subject to box constraints.

Usage

```
BBoptim(par, fn, gr=NULL, method=c(2,3,1), lower=-Inf, upper=Inf,
        project=NULL, projectArgs=NULL, control=list(), quiet=FALSE, ...)
```

Arguments

`par` A real vector argument to `fn`, giving the initial guess for the optimizer of the objective function `fn`.

`fn` Nonlinear objective function that is to be optimized. A scalar function that takes a real vector as argument and returns a scalar that is the value of the function at that point (see Details).

`gr` The gradient of the objective function `fn` evaluated at the argument. This is a vector-function that takes a real vector as argument and returns a real vector of the same length. It defaults to `NULL`, which means that the gradient is evaluated numerically. Computations are dramatically faster in high-dimensional problems when the exact gradient is provided. See *Example*.

`method` A vector of integers specifying which Barzilai-Borwein steplengths should be used in a consecutive manner. The methods will be used in the order specified.

`upper` An upper bound for box constraints. See `spg`.

`lower` A lower bound for box constraints. See `spg`.

`project` The projection function that takes a point in $R^n$ and projects it onto a region that defines the constraints of the problem. This is a vector-function that takes a real vector as argument and returns a real vector of the same length. See `spg` for more details.

`projectArgs` A list of arguments to `project`. See `spg()` for more details.

`control` A list of parameters governing the algorithm behaviour. This list is the same as that for `spg` (excepting the default for `trace`). See Details for important special features of control parameters.

`quiet` A logical indicating whether messages about convergence success or failure should be suppressed.

`...` Arguments passed to `fn` (via the optimization algorithm).
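As an illustration of the `gr` argument, the sketch below writes an analytic gradient for the extended Rosenbrock function used in the Examples and checks it against a central finite difference in base R. The names `rosbkext_gr` and `num_grad` are hypothetical helpers introduced here, not part of the BB package; the call to `BBoptim` only runs if BB is installed.

```r
# Extended Rosenbrock function (same definition as in the Examples).
rosbkext <- function(x) {
  n <- length(x)
  j <- 2 * (1:(n/2))
  jm1 <- j - 1
  sum(100 * (x[j] - x[jm1]^2)^2 + (1 - x[jm1])^2)
}

# Analytic gradient (hypothetical helper, derived by hand from rosbkext).
rosbkext_gr <- function(x) {
  n <- length(x)
  j <- 2 * (1:(n/2))
  jm1 <- j - 1
  g <- numeric(n)
  g[jm1] <- -400 * x[jm1] * (x[j] - x[jm1]^2) - 2 * (1 - x[jm1])
  g[j] <- 200 * (x[j] - x[jm1]^2)
  g
}

# Central-difference gradient, used only to sanity-check rosbkext_gr.
num_grad <- function(f, x, h = 1e-6) {
  vapply(seq_along(x), function(i) {
    e <- numeric(length(x))
    e[i] <- h
    (f(x + e) - f(x - e)) / (2 * h)
  }, numeric(1))
}

set.seed(1)
p0 <- rnorm(10)
max_err <- max(abs(rosbkext_gr(p0) - num_grad(rosbkext, p0)))

# Pass the analytic gradient to BBoptim when the package is available.
if (requireNamespace("BB", quietly = TRUE)) {
  ans <- BB::BBoptim(par = p0, fn = rosbkext, gr = rosbkext_gr, quiet = TRUE)
}
```

Checking a hand-written gradient this way before a long run is cheap, and a mismatch usually points to a sign or indexing slip in `gr`.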

Details

This wrapper is especially useful in problems where `spg` is likely to experience convergence difficulties. When `spg()` fails, i.e. when `convergence > 0` is obtained, a user might attempt various strategies to find a local optimizer. The function `BBoptim` tries the following sequential strategy:

1. Try a different BB steplength. Since the default is `method = 3` for `spg`, BBoptim wrapper tries `method = c(2, 3, 1)`.

2. Try a different non-monotonicity parameter `M` for each method, i.e. BBoptim wrapper tries `M = c(50, 10)` for each BB steplength.
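The two steps above can be sketched as a loop over all `method`/`M` combinations that stops at the first successful run. This is a minimal illustration, not the package's implementation: `try_bb_strategies`, `solver`, and `mock_solver` are hypothetical names, the order in which combinations are visited is an assumption, and the "best so far" rule assumes a minimization problem.

```r
# Sketch of the sequential strategy; not the BB package's own code.
try_bb_strategies <- function(par, fn, solver,
                              method = c(2, 3, 1), M = c(50, 10), ...) {
  # All method/M combinations; the visiting order here is an assumption.
  grid <- expand.grid(method = method, M = M)
  best <- NULL
  for (i in seq_len(nrow(grid))) {
    ans <- solver(par = par, fn = fn, method = grid$method[i],
                  control = list(M = grid$M[i], trace = FALSE),
                  quiet = TRUE, ...)
    ans$cpar <- c(method = grid$method[i], M = grid$M[i])
    if (ans$convergence == 0) return(ans)  # stop at the first success
    if (is.null(best) || ans$value < best$value) best <- ans
  }
  best  # nothing converged: return the run with the lowest objective value
}

# Stand-in solver so the sketch runs without BB: it "fails" unless method is 3.
mock_solver <- function(par, fn, method, control, quiet, ...) {
  list(par = par, value = fn(par), convergence = if (method == 3) 0 else 1)
}
demo <- try_bb_strategies(c(1, 2), function(x) sum(x^2), mock_solver)
demo$cpar

# With BB installed, spg() can be plugged in as the solver.
if (requireNamespace("BB", quietly = TRUE)) {
  fit <- try_bb_strategies(c(1, 2), function(x) sum(x^2), BB::spg)
}
```

In practice `BBoptim` already does this bookkeeping, so the sketch is mainly useful for seeing where the `cpar` element of the result comes from.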

The argument `control` defaults to a list with values ```maxit = 1500, M = c(50, 10), ftol=1.e-10, gtol = 1e-05, maxfeval = 10000, maximize = FALSE, trace = FALSE, triter = 10, eps = 1e-07, checkGrad=NULL```. It is recommended that `checkGrad` be set to FALSE for high-dimensional problems, after making sure that the gradient is correctly specified. See `spg` for additional details about the default.

If `control` is specified as an argument, only values which are different need to be given in the list. See `spg` for more details.
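The effect of a partial `control` list can be pictured with base R's `modifyList()`: entries supplied by the user replace the matching defaults, and everything else is left alone. This only illustrates the semantics described above; it is not necessarily how `BBoptim` merges the lists internally, and `defaults`, `user`, and `ctrl` are names chosen for this sketch.

```r
# Defaults as listed in the Details text.
defaults <- list(maxit = 1500, M = c(50, 10), ftol = 1e-10, gtol = 1e-05,
                 maxfeval = 10000, maximize = FALSE, trace = FALSE,
                 triter = 10, eps = 1e-07, checkGrad = NULL)

# Only the values that differ from the defaults are given.
user <- list(maxit = 500, trace = TRUE)
ctrl <- modifyList(defaults, user)
str(ctrl)

# The equivalent call, run only if BB is installed.
if (requireNamespace("BB", quietly = TRUE)) {
  fit <- BB::BBoptim(par = c(1, 2), fn = function(x) sum(x^2),
                     control = user, quiet = TRUE)
}
```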

Value

A list with the same elements as returned by `spg`. One additional element returned is `cpar` which contains the control parameter settings used to obtain successful convergence, or to obtain the best solution in case of failure.
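A short sketch of reading the returned list follows. The helper `describe_fit` is hypothetical, and it assumes `cpar` is a named numeric vector with entries `method` and `M`, as the printed example output suggests. The list `example_ans` is built by hand from values shown in the example output so that the snippet runs without BB.

```r
# Hypothetical helper: one-line summary of a BBoptim-style result list.
describe_fit <- function(ans) {
  status <- if (ans$convergence == 0) "converged" else "did not converge"
  sprintf("%s: value = %.3g after %d iterations (method = %d, M = %d)",
          status, ans$value, as.integer(ans$iter),
          as.integer(ans$cpar[["method"]]), as.integer(ans$cpar[["M"]]))
}

# Hand-built list mirroring a subset of the elements in the example output.
example_ans <- list(value = 2.488589e-08, iter = 242, convergence = 0,
                    cpar = c(method = 2, M = 50))
msg <- describe_fit(example_ans)
print(msg)
```

Checking `convergence` before using `par` matters because, on failure, the returned solution is only the best one found across the attempted settings.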

References

R Varadhan and PD Gilbert (2009), BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function, J. Statistical Software, 32:4, http://www.jstatsoft.org/v32/i04/

See Also

`BBsolve`, `spg`, `multiStart`, `optim`, `grad`

Examples

```
# Load BB so that spg() and BBoptim() are available.
library(BB)

# Use a preset seed so test values are reproducible.
require("setRNG")
old.seed <- setRNG(list(kind="Mersenne-Twister", normal.kind="Inversion",
    seed=1234))

rosbkext <- function(x){
  # Extended Rosenbrock function
  n <- length(x)
  j <- 2 * (1:(n/2))
  jm1 <- j - 1
  sum(100 * (x[j] - x[jm1]^2)^2 + (1 - x[jm1])^2)
}

p0 <- rnorm(50)
spg(par=p0, fn=rosbkext)
BBoptim(par=p0, fn=rosbkext)

# compare the improvement in convergence when bounds are specified
BBoptim(par=p0, fn=rosbkext, lower=0)

# identical to spg() with defaults
BBoptim(par=p0, fn=rosbkext, method=3, control=list(M=10, trace=TRUE))
```

Example output

```
Loading required package: setRNG
iter:  0  f-value:  10824.77  pgrad:  5320.046
iter:  10  f-value:  39.77528  pgrad:  27.99759
iter:  20  f-value:  30.22168  pgrad:  10.38169
iter:  30  f-value:  26.85768  pgrad:  40.95957
iter:  40  f-value:  24.2456  pgrad:  2.925454
iter:  50  f-value:  18.21567  pgrad:  3.079435
iter:  60  f-value:  14.91738  pgrad:  2.334929
iter:  70  f-value:  13.44534  pgrad:  11.99571
iter:  80  f-value:  11.40815  pgrad:  1.799464
iter:  90  f-value:  9.712975  pgrad:  4.29862
iter:  100  f-value:  8.392164  pgrad:  11.32991
iter:  110  f-value:  7.311477  pgrad:  1.565696
iter:  120  f-value:  6.639707  pgrad:  3.421991
iter:  130  f-value:  6.193688  pgrad:  22.20667
iter:  140  f-value:  4.66529  pgrad:  1.574616
iter:  150  f-value:  3.937451  pgrad:  8.75098
iter:  160  f-value:  3.153241  pgrad:  6.673114
iter:  170  f-value:  2.208422  pgrad:  1.53933
iter:  180  f-value:  1.669985  pgrad:  0.8507576
iter:  190  f-value:  1.449601  pgrad:  1.953953
iter:  200  f-value:  1.371948  pgrad:  0.6711463
iter:  210  f-value:  1.114267  pgrad:  1.776885
iter:  220  f-value:  0.9583621  pgrad:  1.568243
iter:  230  f-value:  0.8269146  pgrad:  3.633741
iter:  240  f-value:  0.6304826  pgrad:  0.3339276
iter:  250  f-value:  0.5107885  pgrad:  0.2684569
iter:  260  f-value:  0.4613847  pgrad:  0.2476938
iter:  270  f-value:  0.3270633  pgrad:  11.01763
iter:  280  f-value:  0.1742687  pgrad:  0.444439
iter:  290  f-value:  0.1625566  pgrad:  0.1235645
iter:  300  f-value:  0.101921  pgrad:  1.459339
iter:  310  f-value:  0.09256554  pgrad:  0.08955312
iter:  320  f-value:  0.04521948  pgrad:  0.05963155
iter:  330  f-value:  0.02634377  pgrad:  0.04566414
iter:  340  f-value:  0.01193873  pgrad:  0.0904557
iter:  350  f-value:  0.008038826  pgrad:  0.1842721
iter:  360  f-value:  0.01248124  pgrad:  3.643765
iter:  370  f-value:  0.001487418  pgrad:  0.1724186
iter:  380  f-value:  0.0009728912  pgrad:  0.008121104
iter:  390  f-value:  0.0001587139  pgrad:  0.003207078
iter:  400  f-value:  1.091916e-07  pgrad:  4.596227e-05
$par
[1] 0.9999292 0.9998582 0.9999353 0.9998704 0.9999426 0.9998850 0.9999344
[8] 0.9998686 0.9999350 0.9998698 0.9999351 0.9998700 0.9999316 0.9998630
[15] 0.9999404 0.9998806 0.9999350 0.9998699 0.9999125 0.9998247 0.9999356
[22] 0.9998711 0.9999310 0.9998619 0.9999353 0.9998704 0.9999356 0.9998710
[29] 0.9999353 0.9998705 0.9999378 0.9998755 0.9999341 0.9998680 0.9999349
[36] 0.9998695 0.9999350 0.9998697 0.9999347 0.9998693 0.9999360 0.9998718
[43] 0.9999330 0.9998659 0.9999348 0.9998694 0.9999351 0.9998700 0.9999344
[50] 0.9998686

$value
[1] 1.091439e-07

$gradient
[1] 4.597844e-05

$fn.reduction
[1] 10824.77

$iter
[1] 401

$feval
[1] 493

$convergence
[1] 0

$message
[1] "Successful convergence"

iter:  0  f-value:  10824.77  pgrad:  5320.046
iter:  10  f-value:  54.40324  pgrad:  52.26724
iter:  20  f-value:  33.13841  pgrad:  2.110108
iter:  30  f-value:  26.40889  pgrad:  2.109646
iter:  40  f-value:  18.79585  pgrad:  2.063969
iter:  50  f-value:  16.09957  pgrad:  2.000223
iter:  60  f-value:  13.9462  pgrad:  2.576766
iter:  70  f-value:  12.70453  pgrad:  2.179719
iter:  80  f-value:  10.25791  pgrad:  2.176441
iter:  90  f-value:  8.034416  pgrad:  12.35637
iter:  100  f-value:  7.331731  pgrad:  2.065253
iter:  110  f-value:  6.600971  pgrad:  1.61036
iter:  120  f-value:  4.842992  pgrad:  1.517246
iter:  130  f-value:  4.273633  pgrad:  1.502324
iter:  140  f-value:  3.519309  pgrad:  17.5416
iter:  150  f-value:  2.483051  pgrad:  2.053522
iter:  160  f-value:  3.74858  pgrad:  35.27227
iter:  170  f-value:  1.095609  pgrad:  0.8585015
iter:  180  f-value:  1.002489  pgrad:  0.4675539
iter:  190  f-value:  0.9069942  pgrad:  0.4363785
iter:  200  f-value:  0.4666699  pgrad:  1.127691
iter:  210  f-value:  0.3889428  pgrad:  0.2377373
iter:  220  f-value:  0.07270047  pgrad:  2.09311
iter:  230  f-value:  0.01169285  pgrad:  0.0282269
iter:  240  f-value:  0.01127181  pgrad:  1.139276
Successful convergence.
$par
[1] 0.9999683 0.9999366 0.9999685 0.9999369 0.9999687 0.9999373 0.9999684
[8] 0.9999368 0.9999685 0.9999369 0.9999685 0.9999369 0.9999684 0.9999367
[15] 0.9999686 0.9999372 0.9999685 0.9999369 0.9999682 0.9999363 0.9999685
[22] 0.9999369 0.9999684 0.9999367 0.9999685 0.9999369 0.9999685 0.9999369
[29] 0.9999685 0.9999369 0.9999685 0.9999370 0.9999684 0.9999368 0.9999685
[36] 0.9999369 0.9999685 0.9999369 0.9999685 0.9999369 0.9999685 0.9999369
[43] 0.9999684 0.9999368 0.9999685 0.9999369 0.9999685 0.9999369 0.9999684
[50] 0.9999368

$value
[1] 2.488589e-08

$gradient
[1] 1.617719e-06

$fn.reduction
[1] 10824.77

$iter
[1] 242

$feval
[1] 345

$convergence
[1] 0

$message
[1] "Successful convergence"

$cpar
method      M
2     50

iter:  0  f-value:  10824.77  pgrad:  93589969068
iter:  10  f-value:  80.17689  pgrad:  89.45276
iter:  20  f-value:  18.6482  pgrad:  2.331853
iter:  30  f-value:  16.57026  pgrad:  14.70272
iter:  40  f-value:  14.53976  pgrad:  1.620273
iter:  50  f-value:  13.00756  pgrad:  14.99384
iter:  60  f-value:  12.46553  pgrad:  1.191981
iter:  70  f-value:  11.5946  pgrad:  1.075108
iter:  80  f-value:  11.05444  pgrad:  1.007026
iter:  90  f-value:  10.2394  pgrad:  0.9117447
iter:  100  f-value:  9.546816  pgrad:  0.8072167
iter:  110  f-value:  9.009637  pgrad:  0.7865717
iter:  120  f-value:  8.107801  pgrad:  1.592373
iter:  130  f-value:  7.803584  pgrad:  0.6117142
iter:  140  f-value:  7.656692  pgrad:  0.5939561
iter:  150  f-value:  4.065993  pgrad:  0.4956877
iter:  160  f-value:  3.959986  pgrad:  0.8255063
iter:  170  f-value:  3.887696  pgrad:  0.4946903
iter:  180  f-value:  3.701328  pgrad:  2.09004
iter:  190  f-value:  3.632759  pgrad:  0.4825317
iter:  200  f-value:  3.098259  pgrad:  1.007279
iter:  210  f-value:  3.039008  pgrad:  1.593319
iter:  220  f-value:  3.018869  pgrad:  0.488434
iter:  230  f-value:  2.667001  pgrad:  4.129558
iter:  240  f-value:  2.554116  pgrad:  0.4780626
iter:  250  f-value:  2.207891  pgrad:  1.591027
iter:  260  f-value:  1.998948  pgrad:  0.5785394
iter:  270  f-value:  1.96601  pgrad:  0.4639743
iter:  280  f-value:  1.934402  pgrad:  0.462478
iter:  290  f-value:  1.849995  pgrad:  0.5029136
iter:  300  f-value:  1.834994  pgrad:  0.4596853
iter:  310  f-value:  1.814697  pgrad:  0.4329975
iter:  320  f-value:  1.100727  pgrad:  0.6330166
iter:  330  f-value:  1.091672  pgrad:  0.4246225
iter:  340  f-value:  1.034675  pgrad:  0.4189956
iter:  350  f-value:  1.017113  pgrad:  0.4180038
iter:  360  f-value:  1.00267  pgrad:  0.4166663
iter:  370  f-value:  0.8821494  pgrad:  7.53588
iter:  380  f-value:  0.7199736  pgrad:  0.3883554
iter:  390  f-value:  0.5232619  pgrad:  0.3573914
iter:  400  f-value:  0.5201436  pgrad:  0.355807
iter:  410  f-value:  0.4125356  pgrad:  0.3006682
iter:  420  f-value:  0.4067102  pgrad:  0.3310747
iter:  430  f-value:  0.4018838  pgrad:  0.3295781
iter:  440  f-value:  0.3468103  pgrad:  0.3148879
iter:  450  f-value:  0.2720018  pgrad:  0.2911181
iter:  460  f-value:  0.2599182  pgrad:  0.2865981
iter:  470  f-value:  0.258163  pgrad:  0.2861937
iter:  480  f-value:  0.0674882  pgrad:  0.3424147
iter:  490  f-value:  0.002635549  pgrad:  0.0397948
iter:  500  f-value:  0.0003608076  pgrad:  0.01779499
iter:  510  f-value:  2.282022e-08  pgrad:  0.0006876697
Successful convergence.
$par
[1] 0.9999699 0.9999399 0.9999700 0.9999399 0.9999700 0.9999399 0.9999699
[8] 0.9999399 0.9999699 0.9999399 0.9999699 0.9999399 0.9999699 0.9999399
[15] 0.9999700 0.9999399 0.9999699 0.9999399 0.9999700 0.9999399 0.9999699
[22] 0.9999399 0.9999699 0.9999399 0.9999699 0.9999399 0.9999700 0.9999399
[29] 0.9999699 0.9999399 0.9999700 0.9999399 0.9999699 0.9999399 0.9999699
[36] 0.9999399 0.9999699 0.9999399 0.9999699 0.9999399 0.9999698 0.9999395
[43] 0.9999699 0.9999399 0.9999699 0.9999399 0.9999699 0.9999399 0.9999699
[50] 0.9999399

$value
[1] 2.259115e-08

$gradient
[1] 1.583901e-07

$fn.reduction
[1] 10824.77

$iter
[1] 513

$feval
[1] 638

$convergence
[1] 0

$message
[1] "Successful convergence"

$cpar
method      M
2     50

iter:  0  f-value:  10824.77  pgrad:  5320.046
iter:  10  f-value:  39.77528  pgrad:  27.99759
iter:  20  f-value:  30.22168  pgrad:  10.38169
iter:  30  f-value:  26.85768  pgrad:  40.95957
iter:  40  f-value:  24.2456  pgrad:  2.925454
iter:  50  f-value:  18.21567  pgrad:  3.079435
iter:  60  f-value:  14.91738  pgrad:  2.334929
iter:  70  f-value:  13.44534  pgrad:  11.99571
iter:  80  f-value:  11.40815  pgrad:  1.799464
iter:  90  f-value:  9.712975  pgrad:  4.29862
iter:  100  f-value:  8.392164  pgrad:  11.32991
iter:  110  f-value:  7.311477  pgrad:  1.565696
iter:  120  f-value:  6.639707  pgrad:  3.421991
iter:  130  f-value:  6.193688  pgrad:  22.20667
iter:  140  f-value:  4.66529  pgrad:  1.574616
iter:  150  f-value:  3.937451  pgrad:  8.75098
iter:  160  f-value:  3.153241  pgrad:  6.673114
iter:  170  f-value:  2.208422  pgrad:  1.53933
iter:  180  f-value:  1.669985  pgrad:  0.8507576
iter:  190  f-value:  1.449601  pgrad:  1.953953
iter:  200  f-value:  1.371948  pgrad:  0.6711463
iter:  210  f-value:  1.114267  pgrad:  1.776885
iter:  220  f-value:  0.9583621  pgrad:  1.568243
iter:  230  f-value:  0.8269146  pgrad:  3.633741
iter:  240  f-value:  0.6304826  pgrad:  0.3339276
iter:  250  f-value:  0.5107885  pgrad:  0.2684569
iter:  260  f-value:  0.4613847  pgrad:  0.2476938
iter:  270  f-value:  0.3270633  pgrad:  11.01763
iter:  280  f-value:  0.1742687  pgrad:  0.444439
iter:  290  f-value:  0.1625566  pgrad:  0.1235645
iter:  300  f-value:  0.101921  pgrad:  1.459339
iter:  310  f-value:  0.09256554  pgrad:  0.08955312
iter:  320  f-value:  0.04521948  pgrad:  0.05963155
iter:  330  f-value:  0.02634377  pgrad:  0.04566414
iter:  340  f-value:  0.01193873  pgrad:  0.0904557
iter:  350  f-value:  0.008038826  pgrad:  0.1842721
iter:  360  f-value:  0.01248124  pgrad:  3.643765
iter:  370  f-value:  0.001487418  pgrad:  0.1724186
iter:  380  f-value:  0.0009728912  pgrad:  0.008121104
iter:  390  f-value:  0.0001587139  pgrad:  0.003207078
iter:  400  f-value:  1.091916e-07  pgrad:  4.596227e-05
Successful convergence.
$par
[1] 0.9999292 0.9998582 0.9999353 0.9998704 0.9999426 0.9998850 0.9999344
[8] 0.9998686 0.9999350 0.9998698 0.9999351 0.9998700 0.9999316 0.9998630
[15] 0.9999404 0.9998806 0.9999350 0.9998699 0.9999125 0.9998247 0.9999356
[22] 0.9998711 0.9999310 0.9998619 0.9999353 0.9998704 0.9999356 0.9998710
[29] 0.9999353 0.9998705 0.9999378 0.9998755 0.9999341 0.9998680 0.9999349
[36] 0.9998695 0.9999350 0.9998697 0.9999347 0.9998693 0.9999360 0.9998718
[43] 0.9999330 0.9998659 0.9999348 0.9998694 0.9999351 0.9998700 0.9999344
[50] 0.9998686

$value
[1] 1.091439e-07

$gradient
[1] 4.597844e-05

$fn.reduction
[1] 10824.77

$iter
[1] 401

$feval
[1] 493

$convergence
[1] 0

$message
[1] "Successful convergence"

$cpar
method      M
3     10
```

