optimx: General-purpose optimization


Description

General-purpose optimization wrapper function that calls other R tools for optimization, including the existing optim() function. optimx also tries to unify the calling sequence so that a number of tools can share the same front-end. These include methods from optim() itself, the base functions nlm() and nlminb(), and routines from packages such as BB (spg), ucminf, minqa, Rcgmin, Rvmmin, and dfoptim.

Note that optim() itself allows Nelder–Mead, quasi-Newton and conjugate-gradient algorithms as well as box-constrained optimization via L-BFGS-B. Because SANN does not return a meaningful convergence code (conv), optimx() does not call the SANN method.

Usage

optimx(par, fn, gr=NULL, hess=NULL, lower=-Inf, upper=Inf, bdmsk=NULL, 
       method=c("Nelder-Mead","BFGS"), itnmax=NULL, hessian=FALSE,
       control=list(), ...)

Arguments

par

A vector of initial values for the parameters for which optimal values are to be found.

fn

A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.

gr

A function to return (as a vector) the gradient for those methods that can use this information. These include, among others, the methods "BFGS", "CG", "L-BFGS-B", "ucminf", "Rvmmin", "Rcgmin", "nlm", "nlminb", and "spg".

If 'gr' is NULL, a finite-difference approximation will be used. An open question is whether the SAME approximation code is used for all methods, or whether there are differences that could or should be examined. As of 2012-08-03, nlm and nlminb use their own approximations when gr is NULL; otherwise, a forward gradient approximation is used.

Gradient approximation routines can be specified by enclosing their names in quotes. Thus gr="grfwd" calls the forward gradient approximation. Also available in the optextras package are "grcentral", "grback" and "grnd" (which uses numDeriv).
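
For example, a minimal sketch (the Rosenbrock function here is the same one used in the Examples below; the call is illustrative only):

fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
# forward-difference gradient approximation selected by name
a.fwd <- optimx(c(-1.2, 1), fr, gr = "grfwd", method = "BFGS")
# central differences usually cost more evaluations but are more accurate
a.ctl <- optimx(c(-1.2, 1), fr, gr = "grcentral", method = "BFGS")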

hess

A function to return (as a symmetric matrix) the Hessian of the objective function for those methods that can use this information.

lower, upper

Bounds on the variables for methods such as "L-BFGS-B" that can handle box (or bounds) constraints.

bdmsk

Vector of indicators for bounds and masks. Masked (fixed) parameters are specified on input by entries in bdmsk that are 0, while free parameters have 1. On termination there may be values -3 for active lower bounds and -1 for active upper bounds. However, as of 2011-06-28 the bdmsk vector is NOT returned. (It is returned by package Rvmmin.)

Note that only Rcgmin and Rvmmin handle masks as of 2012-05-20, so these methods are attempted if masks are specified, even when another method is suggested via the method argument.

method

A list of the methods to be used. Note that this is an important change from optim(), which allows only one method to be specified. See 'Details'.

itnmax

If provided as a vector of the same length as the list of methods method, gives the maximum number of iterations or function values for the corresponding method. If a single number is provided, this will be used for all methods. Note that there may be control list elements with similar functions, but this should be the preferred approach when using optimx.
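
For example, a minimal sketch (using the Rosenbrock function fr and gradient grr from the Examples below):

# up to 500 Nelder-Mead iterations/evaluations, then up to 100 for BFGS
ans <- optimx(c(-1.2, 1), fr, grr,
              method = c("Nelder-Mead", "BFGS"), itnmax = c(500, 100))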

hessian

A logical control that, if TRUE, forces the computation of an approximation to the Hessian at the final set of parameters. If FALSE (default), the Hessian is calculated if needed to provide the KKT optimality tests (see kkt in 'Details' for the control list). This setting is provided primarily for compatibility with optim(). Note that there are extended possibilities in ucminf, where hessian is treated as an integer, but these, as well as the facility in optim, are ignored, and the numerical Hessian from kktc is used.

control

A list of control parameters. See ‘Details’.

...

Further arguments to be passed to fn and gr.

Details

Note that arguments after ... must be matched exactly.

By default this function performs minimization, but it will maximize if control$maximize is TRUE. The original optim() function allows control$fnscale to be set negative to accomplish this. DO NOT use both mechanisms.
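
For example, a minimal sketch of maximization via the control list (the objective negsq is illustrative only):

negsq <- function(x) 10 - sum((x - 2)^2)   # maximum of 10 at x = c(2, 2)
ans.max <- optimx(c(0, 0), negsq, method = "Nelder-Mead",
                  control = list(maximize = TRUE))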

Possible method codes at the time of writing are 'Nelder-Mead', 'BFGS', 'CG', 'L-BFGS-B', 'nlm', 'nlminb', 'spg', 'ucminf', 'Rcgmin', 'Rvmmin', 'bobyqa', 'uobyqa', 'newuoa', 'nmkb', and 'hjkb'.

(For convenience, 'NM' may be used instead of 'Nelder-Mead'; it is also the name displayed, to minimize display width. Similarly, 'LBFGSB' may be used instead of 'L-BFGS-B'.)

The default methods for unconstrained problems (no lower or upper specified) are 'Nelder-Mead', an implementation of the Nelder and Mead (1965) simplex method, and 'BFGS', a variable metric method based on the ideas of Fletcher (1970) as modified by him in conversation with J.C. Nash (see Nash, 1979). Nelder-Mead uses only function values and is robust but relatively slow. It will work reasonably well for non-differentiable functions. The variable metric method "BFGS" updates an approximation to the inverse Hessian using the BFGS update formulas, along with an acceptable-point line search strategy. This method appears to work best with analytic gradients. ("Rvmmin" provides a box-constrained version of this algorithm, but there are some differences in detail, so the trajectories of the minimization of an unconstrained problem will not match exactly.)

If no method is given, and there are bounds constraints provided, the method is set to "L-BFGS-B".

Method "CG" is a conjugate gradients method based on that by Fletcher and Reeves (1964) (but with the option of Polak–Ribiere or Beale–Sorenson updates). The particular implementation is now dated; improved yet simpler codes exist, and one of them is provided via Rcgmin, which furthermore permits box constraints on the parameters. Conjugate gradient methods will generally be more fragile than the BFGS method, but as they do not store a matrix they may be successful in much larger optimization problems.

Method "L-BFGS-B" is that of Byrd et al. (1995), which allows box constraints, that is, each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints. This uses a limited-memory modification of the BFGS quasi-Newton method. If non-trivial bounds are supplied, this method will be selected, with a warning.
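
For example, a minimal sketch (using the bounded test function flb from the Examples below; since bounds are supplied and no method is named, "L-BFGS-B" is selected):

flb <- function(x) { p <- length(x); sum(c(1, rep(4, p - 1)) * (x - c(1, x[-p])^2)^2) }
ans.box <- optimx(rep(3, 5), flb, lower = rep(2, 5), upper = rep(4, 5))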

Nocedal and Wright (1999) is a comprehensive reference for the previous three methods.

The user objective function (fn) can commonly return NA or Inf if the function cannot be evaluated at the supplied parameters. optimx requires that the initial parameters give a computable finite value of fn. However, some methods, of which "L-BFGS-B" is known to be a case, require that the values returned should always be finite. To control for excursions outside the domain where the function fn and, if provided, gr and hess can be computed, optimx wraps these functions using ufn, ugr and uhess.
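
For example, a minimal sketch (the objective safelog is illustrative only) of a user function that returns a large finite value, rather than Inf, when evaluated outside its domain:

safelog <- function(x) {
    if (any(x <= 0)) return(1e10)   # large finite penalty instead of Inf
    sum((log(x) - 1)^2)
}
ans.safe <- optimx(c(0.5, 0.5), safelog)   # default methods may step where x <= 0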

While optim can be used recursively, and for a single parameter as well as many, this may not be true for optimx. optim also accepts a zero-length par, and just evaluates the function with that argument. Such usage is strongly discouraged for optimx.

Method "nlm" is from the package of the same name that implements ideas of Dennis and Schnabel (1983) and Schnabel et al. (1985). See nlm() for more details.

Method "nlminb" is the package of the same name that uses the minimization tools of the PORT library. The PORT documentation is at <URL: http://netlib.bell-labs.com/cm/cs/cstr/153.pdf>. See nlminb() for details. (Though there is very little information about the methods.)

Method "spg" is from package BB, implementing a spectral projected gradient method for large-scale optimization with simple constraints. It is an R adaptation, with significant modifications, by Ravi Varadhan, Johns Hopkins University (Varadhan and Gilbert, 2009), of the original FORTRAN code of Birgin, Martinez, and Raydan (2001).

Method "Rcgmin" is from the package of that name. It implements a conjugate gradient algorithm with the Yuan/Dai update (ref??) and also allows bounds constraints on the parameters. (Rcgmin also allows mask constraints – fixing individual parameters – but there is no interface from "optimx".)

Methods "bobyqa", "uobyqa" and "newuoa" are from the package "minqa", which implements optimization-by-quadratic-approximation routines of similar names due to M. J. D. Powell (2009). See package minqa for details. Note that "uobyqa" and "newuoa" are for unconstrained minimization, while "bobyqa" is for box-constrained problems.

Methods nmkb and hjkb are box-constrained versions of Tim Kelley's Nelder-Mead implementation and of the Hooke and Jeeves method. When unconstrained problems are presented, the functions nmk and hjk are called instead, but these naming variations are NOT permitted in the method vector in the call to optimx.

The control argument is a list that can supply any of the following components:

trace

Non-negative integer. If positive, tracing information on the progress of the optimization is produced. Higher values may produce more tracing information: for method "L-BFGS-B" there are six levels of tracing. trace = 0 gives no output. (To understand exactly what these do, see the source code: higher levels give more detail.)

follow.on

= TRUE or FALSE. If TRUE, and there are multiple methods, then the last set of parameters from one method is used as the starting set for the next.

save.failures

= TRUE if we wish to keep "answers" from runs where the method does not return conv==0. FALSE otherwise (default).

maximize

= TRUE if we want to maximize rather than minimize a function. (Default FALSE). Methods nlm, nlminb, ucminf cannot maximize a function, so the user must explicitly minimize and carry out the adjustment externally. However, there is a check to avoid usage of these codes when maximize is TRUE. See fnscale below for the method used in optim that we deprecate.

all.methods

= TRUE if we want to use all available (and suitable) methods.

kkt

=FALSE if we do NOT want to test the Kuhn, Karush, Tucker optimality conditions. The default is TRUE. However, because the Hessian computation may be very slow, we set kkt to FALSE if there are more than 50 parameters when the gradient function gr is not provided, and more than 500 parameters when such a function is specified. We return logical values KKT1 and KKT2, TRUE if the first- and second-order conditions are satisfied approximately. Note, however, that the tests are sensitive to scaling, and users may need to perform additional verification. If kkt is FALSE but hessian is TRUE, then KKT1 is generated, but KKT2 is not.

sort.result

= TRUE if we want to have the results table sorted into reverse order of final objective function.

kkttol

= value to use to check for small gradient and negative Hessian eigenvalues. Default = .Machine$double.eps^(1/3)

kkt2tol

= Tolerance for eigenvalue ratio in KKT test of positive definite Hessian. Default same as for kkttol

starttests

= TRUE if we want to run tests of the function and parameters: feasibility relative to bounds, analytic vs numerical gradient, scaling tests, before we try optimization methods. Default is TRUE.

dowarn

= TRUE if we want warnings generated by optimx. Default is TRUE.

badval

= The value to set for the function value when try(fn()) fails.
Default is (0.5)*.Machine$double.xmax

axsearch.tries

= The number of times to try the axial search. If >1, then restarts will be attempted until axial search does not yield a lower function value. Default is 3. CAUTION: Be very careful using axial search when maximizing a function. It is much safer to define your own function that is (-1) * (function to be maximized) and to use minimization.

The following control elements apply only to some of the methods. The list may be incomplete. See individual packages for details.

fnscale

An overall scaling to be applied to the value of fn and gr during optimization. If negative, turns the problem into a maximization problem. Optimization is performed on fn(par)/fnscale. For methods from the set in optim(). Note potential conflicts with the control maximize. This also conflicts with the argument ps to the tools in ufn, ugr and uhess.

parscale

A vector of scaling values for the parameters. Optimization is performed on par/parscale and these should be comparable in the sense that a unit change in any element produces about a unit change in the scaled value. For optim. This also conflicts with the argument ps to the tools in ufn, ugr and uhess.

ndeps

A vector of step sizes for the finite-difference approximation to the gradient, on par/parscale scale. Defaults to 1e-3. For optim.

maxit

The maximum number of iterations. Defaults to 100 for the derivative-based methods, and 500 for "Nelder-Mead".

abstol

The absolute convergence tolerance. Only useful for non-negative functions, as a tolerance for reaching zero.

reltol

Relative convergence tolerance. The algorithm stops if it is unable to reduce the value by a factor of reltol * (abs(val) + reltol) at a step. Defaults to sqrt(.Machine$double.eps), typically about 1e-8. For optim.

alpha, beta, gamma

Scaling parameters for the "Nelder-Mead" method. alpha is the reflection factor (default 1.0), beta the contraction factor (0.5) and gamma the expansion factor (2.0).

REPORT

The frequency of reports for the "BFGS" and "L-BFGS-B" methods if control$trace is positive. Defaults to every 10 iterations for "BFGS" and "L-BFGS-B".

type

for the conjugate-gradients method. Takes value 1 for the Fletcher–Reeves update, 2 for Polak–Ribiere and 3 for Beale–Sorenson.

lmm

is an integer giving the number of BFGS updates retained in the "L-BFGS-B" method. It defaults to 5.

factr

controls the convergence of the "L-BFGS-B" method. Convergence occurs when the reduction in the objective is within this factor of the machine tolerance. Default is 1e7, that is a tolerance of about 1e-8.

pgtol

helps control the convergence of the "L-BFGS-B" method. It is a tolerance on the projected gradient in the current search direction. This defaults to zero, in which case the check is suppressed.

Any names given to par will be copied to the vectors passed to fn and gr. Note that no other attributes of par are copied over. (We have not verified this as at 2011-11-02.)

Value

ansout <- data.frame(cbind(method=meths ,par=pars, fvalues=vals,
fns=fevals, grs=gevals, itns=nitns, conv=convcode, KKT1=kkt1,
KKT2=kkt2, xtimes=xtimes, meths=meths))

A data frame with the following columns, each having one element (row) for each of the methods that has returned an answer. In each row the column elements give:

method

The name of the method used for this answer.

par

The best set of parameters found.

fvalues

The value of fn corresponding to par.

fns

The number of calls to fn.

grs

The number of calls to gr. This excludes those calls needed to compute the Hessian, if requested, and any calls to fn to compute a finite-difference approximation to the gradient.

itns

For those methods where it is reported, the number of “iterations”. See the documentation or code for particular methods for the meaning of such counts.

conv

An integer code. 0 indicates successful convergence. Various methods may or may not return sufficient information to allow all the codes to be specified. An incomplete list of codes includes

-1

indicates that the function could NOT be computed at the initial point.

-2

indicates that the bounds are inadmissible, i.e., at least one lower bound exceeds the corresponding upper bound.

-3

indicates that at least one of the parameters is out of bounds (and the call has been made with control allowmv2bound = FALSE).

-4

indicates that the initial function returned was NOT a scalar.

-5

indicates that the gradient test has failed when there is a function specified to compute the gradient.

-6

indicates that the Hessian test has failed when there is a function specified to compute the Hessian.

1

indicates that the iteration limit maxit had been reached.

3

indicates that the axial search failed.

20

indicates that the initial set of parameters is inadmissible, that is, that the function cannot be computed or returns an infinite, NULL, or NA value.

21

indicates that an intermediate set of parameters is inadmissible.

10

indicates degeneracy of the Nelder–Mead simplex.

51

indicates a warning from the "L-BFGS-B" method; see component message for further details.

52

indicates an error from the "L-BFGS-B" method; see component message for further details.

KKT1

A logical value returned TRUE if the solution reported has a “small” gradient.

KKT2

A logical value returned TRUE if the solution reported appears to have a positive-definite Hessian.

xtimes

The reported execution time of the calculations for the particular method.

meths

A copy of the first column (method) to aid in reading a printout.

More detail is provided in the attribute "details" to the returned answer object. If the returned object from optimx() is ans, this is accessed via the construct
attr(ans, "details")
This object is a list of lists, one list for each method that has been successful or that has been forced by save.failures=TRUE. To get the details list for the third method, we use
attr(ans, "details")[[3]]
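
For example, a short sketch of examining the final gradient and Hessian information (element names as listed below) for the first method:

dd <- attr(ans, "details")[[1]]
dd$ngatend     # gradient at termination
dd$nhatend     # Hessian at termination
dd$evnhatend   # eigenvalues of that Hessian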

Each such list has possible elements:

par

- the final parameters of the function

value

- the final value of the objective function

convergence

- a code indicating whether and/or how the method terminated

message

- a descriptive message, if one is provided

fevals

- the number of times the objective function was evaluated during the run; this item has an attribute "names" that can take on the value "function"

gevals

- the number of times the gradient of the objective function was evaluated during the run; this item has an attribute "names" that can take on the value "gradient"

kkt1

- an indicator if the first order Kuhn, Karush, Tucker condition for a local minimum is satisfied. Note that we must use a tolerance to make this test, and the indicator is NOT definitive.

kkt2

- an indicator if the second order Kuhn, Karush, Tucker condition for a local minimum is satisfied. Note that we must use a tolerance to make this test, and the indicator is NOT definitive.

ngatend

- the gradient vector computed or estimated at termination

nhatend

- the Hessian matrix computed or estimated at termination

evnhatend

- a vector of the eigenvalues of the Hessian at termination

systime

- the system time required for execution of the method in question (named in "method" below). This element has the attribute "names".

method

- The name of the method used. This has the attribute "CPU times (s)" which itself has an attribute "names" that corresponds to the "method".

Note

optimx will work with a one-dimensional par, but the default method does not work well (and will warn). Use optimize instead.
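
For example, a minimal sketch of the preferred approach for a single parameter:

quad1 <- function(x) (x - 1.5)^2
optimize(quad1, interval = c(-5, 5))   # stats::optimize for one-dimensional problems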

There are a series of demos available. These were set up as tests, but take quite a long time to execute. Once the package is loaded (via require(optimx) or library(optimx)), you may see the available demos via

demo(package="optimx")

The demo 'brown_test' may be run with the command demo(brown_test, package="optimx")

Source

The code for methods "Nelder-Mead", "BFGS" and "CG" was based originally on Pascal code in Nash (1990) that was translated by p2c and then hand-optimized. As author, I (J C Nash) have agreed that the code can be made freely available. However, I encourage the use of some of the later methods e.g., "Rcgmin", "Rvmmin", and "nmkb".

The code for method "L-BFGS-B" is based on Fortran code by Zhu, Byrd, Lu-Chen and Nocedal obtained from Netlib (file 'opt/lbfgs_bcm.shar'; another version is in 'toms/778').

See documentation for "nlm" and "nlminb" for those methods, package "ucminf" for the function of the same name, and package BB for method "spg".

References

Belisle, C. J. P. (1992) Convergence theorems for a class of simulated annealing algorithms on R^d. J. Applied Probability, 29, 885–895.

Birgin, E. G., Martinez, J. M., and Raydan, M. (2001) Algorithm 813: SPG — software for convex-constrained optimization. ACM Transactions on Mathematical Software, 27, 340–349.

Byrd, R. H., Lu, P., Nocedal, J. and Zhu, C. (1995) A limited memory algorithm for bound constrained optimization. SIAM J. Scientific Computing, 16, 1190–1208.

Dennis, J. E. and Schnabel, R. B. (1983) _Numerical Methods for Unconstrained Optimization and Nonlinear Equations._ Prentice-Hall, Englewood Cliffs, NJ.

Fletcher, R. and Reeves, C. M. (1964) Function minimization by conjugate gradients. Computer Journal 7, 148–154.

Fletcher, R (1970) A new approach to variable metric algorithms, Computer Journal, 13, 317-322.

Nash, J. C. (1979, 1990) Compact Numerical Methods for Computers. Linear Algebra and Function Minimisation. Adam Hilger.

Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal 7, 308–313.

Nocedal, J. and Wright, S. J. (1999) Numerical Optimization. Springer.

Schnabel, R. B., Koontz, J. E. and Weiss, B. E. (1985) A modular system of algorithms for unconstrained minimization. _ACM Trans. Math. Software_, *11*, 419-440.

Varadhan, R. and Gilbert, P.D. (2009) BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function. Journal of Statistical Software, 32, 1–26.

See Also

spg, nlm, nlminb, bobyqa, Rcgmin, Rvmmin, ucminf, dfoptim.

optimize for one-dimensional minimization; constrOptim or spg for linearly constrained optimization.

Examples

require(graphics)

cat("Test warning on 1-parameter problems\n")
quad<-function(x){
    qq<-1:length(x)
    fn<-as.numeric(crossprod((x-qq)/qq))
}
quad.g<-function(x){
    qq<-1:length(x)
    gg<-2*(x-qq)/(qq*qq)
}
x<-0.1234
a1par<-optimx(x, quad, quad.g, method="ALL")
optansout(a1par, NULL)

xx<-runif(2)
a2uobyqa<-optimx(xx, quad, method="uobyqa")
a2uobyqa

fr <- function(x) {   ## Rosenbrock Banana function
    x1 <- x[1]
    x2 <- x[2]
    100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) { ## Gradient of 'fr'
    x1 <- x[1]
    x2 <- x[2]
    c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
       200 *      (x2 - x1 * x1))
}
cat("Rosenbrock function, with and without analytic gradients\n")

ans1<-optimx(c(-1.2,1), fr, control=list(all.methods=TRUE))
cat("Default numerical gradients via grfwd\n")
print(ans1)
# print(attr(ans1,"details"))
## Not run: 
cat("\n\n")
cat("Analytic gradients\n")
ans2<-optimx(c(-1.2,1), fr, grr, control=list(all.methods=TRUE))
print(ans2)

## End(Not run)

cat("\n\n")
cat("The call will mask (or fix) the second parameter at 1.\n")
cat("This call will WARN that we cannot use BFGS for masks.\n")
try4<-try(ans4<-optimx(c(-1.2,1), fr, NULL, method = "BFGS", bdmsk=c(1,0)))
print(ans4)


cat("\n\n")
cat("Single method call -- note control for CG is 'type=...'\n")
ans5a<-optimx(c(-1.2,1), fr, grr, method = "CG", control=list(type=1, axsearch.tries=10, trace=0))
print(ans5a)
ans5b<-optimx(c(-1.2,1), fr, grr, method = "CG", control=list(type=2, axsearch.tries=10, trace=0))
print(ans5b)
ans5c<-optimx(c(-1.2,1), fr, grr, method = "CG", control=list(type=3, axsearch.tries=10, trace=0))
print(ans5c)
cat("\n\n")


cat("25-dimensional box constrained function\n")
cat("Note: nmkb fails. Starting vector must have no parameter on bound.\n")
flb <- function(x)
    { p <- length(x); sum(c(1, rep(4, p-1)) * (x - c(1, x[-p])^2)^2) }

ans7<-optimx(rep(3, 25), flb, lower=rep(2, 25), upper=rep(4, 25),
        control=list(all.methods=TRUE)) # par[24] is *not* at boundary
optansout(ans7, NULL)


cat("Check on maximization\n")

maxfn<-function(x) {
      	n<-length(x)
	ss<-seq(1,n)
	f<-10-(crossprod(x-ss))^2
	f<-as.numeric(f)
	return(f)
}

## Not run: 
negmaxfn<-function(x) { # Can minimize this as an alternative
	f<-(-1)*maxfn(x)
	return(f)
}

## End(Not run)

x0<-rep(pi,4)
cat("maximization using standard forward difference numerical derivatives\n")
ans.mx<-optimx(x0,maxfn,gr="grfwd", control=list(maximize=TRUE,all.methods=TRUE,save.failures=TRUE,trace=0))
print(ans.mx)

## Not run: 
cat("verify using explicit negation\n")
ans.mxn<-optimx(x0,negmaxfn,control=list(all.methods=TRUE,save.failures=TRUE,trace=0))
print(ans.mxn)
x0bn<-get.best(ans.mxn)
print(x0bn)
as.x0bn<-axsearch(x0bn$par, negmaxfn, x0bn$value)
print(as.x0bn)
cat("\n\n")

## End(Not run)


## Not run: 
x0b<-get.best(ans.mx, maximize=TRUE)
cat("trying get.best for maxfn\n")
print(x0b)
cat("axial search on x0b\n")
cat("Need to be very careful with axial search when maximizing.\n")
cat("## WRONG ##: as.x0b<-axsearch(x0b, maxfn, x0b$value)\n")
cat("\n Here is how to do it.\n\n")
cat("First define the user function working environment.\n")
mymax<-list2env(list(fn=maxfn, gr=NULL, hess=NULL, MAXIMIZE=TRUE, KFN=0, KGR=0, KHESS=0, 
    PARSCALE=rep(1,length(x0b$par)), FNSCALE=1, dots=NULL))
cat("mymax<-list2env(list(fn=maxfn, gr=NULL, hess=NULL, MAXIMIZE=TRUE, KFN=0, KGR=0, KHESS=0, \n")
cat("    PARSCALE=rep(1,length(x0b$par)), FNSCALE=1, dots=NULL))\n")
cat("\nThen call axial search. Note that the base function value is entered NEGATIVE.\n")
cat("Also that axsearch() always works in the context of MINIMIZING.\n")
cat("as.x0b<-axsearch(x0b$par, ufn, -x0b$value, trace=0, fnuser=mymax)\n")
as.x0b<-axsearch(x0b$par, ufn, -x0b$value, trace=0, fnuser=mymax)
print(as.x0b)

## End(Not run)

## Not run: 
x00<-c(1,2,3,4)
cat("Test if things work when we provide the solution!\n")
ans.mx0<-optimx(x00,maxfn,control=list(all.methods=TRUE,maximize=TRUE,save.failures=TRUE,trace=0))
optansout(ans.mx0, filename="./ansmx0.txt")
x00b<-get.best(ans.mx0)
cat("trying get.best for maxfn from solution\n")
print(x00b)
cat("\n\n")

## End(Not run)


# genrosa function code -- matches rosenbrock when npar=2 and gs=100
genrosa.f<- function(x, gs=NULL){ # objective function
## One generalization of the Rosenbrock banana valley function (n parameters)
	n <- length(x)
        if(is.null(gs)) { gs=100.0 }
        # Note: we do not add 1.0, so the minimum is at 0
	fval<-sum (gs*(x[1:(n-1)]^2 - x[2:n])^2 + (x[1:(n-1)] - 1)^2)
}

genrosa.g <- function(x, gs=NULL){
# vectorized gradient for genrosa.f
# Ravi Varadhan 2009-04-03
	n <- length(x)
        if(is.null(gs)) { gs=100.0 }
	gg <- as.vector(rep(0, n))
	tn <- 2:n
	tn1 <- tn - 1
	z1 <- x[tn] - x[tn1]^2
	z2 <- 1 - x[tn1]
        # f = gs*z1*z1 + z2*z2
	gg[tn] <- 2 * (gs * z1)
	gg[tn1] <- gg[tn1] - 4 * gs * x[tn1] * z1 - 2 *z2 
	return(gg)
}

genrosa.h <- function(x, gs=NULL) { ## compute Hessian
   if(is.null(gs)) { gs=100.0 }
	n <- length(x)
	hh<-matrix(rep(0, n*n),n,n)
	for (i in 2:n) {
		z1<-x[i]-x[i-1]*x[i-1]
#		z2<-1.0 - x[i-1]
                hh[i,i]<-hh[i,i]+2.0*(gs+1.0)
                hh[i-1,i-1]<-hh[i-1,i-1]-4.0*gs*z1-4.0*gs*x[i-1]*(-2.0*x[i-1])
                hh[i,i-1]<-hh[i,i-1]-4.0*gs*x[i-1]
                hh[i-1,i]<-hh[i-1,i]-4.0*gs*x[i-1]
	}
        return(hh)
}

cat("Use genrosa to show dots, follow-on etc.\n")
startx<-4*seq(1:10)/3.
ans8<-optimx(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, control=list(all.methods=TRUE, save.failures=TRUE, trace=0), gs=10)
print(ans8)

## Not run: 
get.result(ans8, attribute="grs")
get.result(ans8, method="spg")
cat("\n\n")

## End(Not run)

## Not run: 
cat("Try follow-on\n")
startx<-4*seq(1:10)/3.
cat("Polyalgorithm with 200 steps NM followed by up to 75 of ucminf\n")
ans9a<-optimx(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, 
             method=c("Nelder-Mead","ucminf"), itnmax=c(200,75), 
             control=list(follow.on=TRUE, save.failures=TRUE,trace=0), gs=10)
print(ans9a)
cat("\n\n")

## End(Not run)

cat("Try follow-on\n")
startx<-4*seq(1:10)/3.
cat("Polyalgorithm with 200 steps NM followed by up to 75 of Rvmmin\n")
ans9<-optimx(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, 
             method=c("Nelder-Mead","Rvmmin"), itnmax=c(200,75), 
             control=list(follow.on=TRUE, save.failures=TRUE,trace=0), gs=10)
print(ans9)
cat("\n\n")

## Not run: 
cat("A case where methods do not work and we do not save failures\n")
startx<-4*seq(1:10)/3.
cat("200 steps NM is not enough to terminate\n")
ans10<-optimx(startx,fn=genrosa.f,gr=genrosa.g, method=c("Nelder-Mead"),
             itnmax=c(200), control=list(trace=0, save.failures=FALSE), gs=10)
cat("The answer should be NULL\n")
print(str(ans10))
cat("\n\n")

## End(Not run)

## Not run: 
startx<-4*seq(1:10)/3.
cat("Try getting hessian but not kkt tests -- no results!\n")
ans11<-optimx(startx,fn=genrosa.f,gr=genrosa.g, hessian=TRUE,  
    control=list(all.methods=TRUE, trace=0, save.failures=FALSE, kkt=FALSE),
    gs=10)
print(ans11)
print(attr(ans11,"details")[7])
cat("\n\n")

## End(Not run)

## Not run: 
startx<-4*seq(1:10)/3.
cat("Use analytic hessian and no KKT tests\n")
ans12<-optimx(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, 
      control=list(trace=0, save.failures=FALSE, kkt=FALSE), gs=10)
print(ans12)
print(attr(ans12,"details"))

## End(Not run)

cat("Example with bounds and masks\n")
startx<-4*seq(1:10)/7.
lo<-rep(0.5,10)
up<-rep(8,10)
bm<-c(1,1,1,1,1,0,0,1,1,1)
ans13<-optimx(startx,fn=genrosa.f,gr=genrosa.g, lower=lo, upper=up, bdmsk=bm, 
       control=list(all.methods=TRUE, save.failures=TRUE, trace=0), gs=4)
print(ans13)
cat("\n\n")



# ======= check names =========
cyq.f <- function (x) {
  rv<-cyq.res(x)
  f<-sum(rv*rv)
}

cyq.res <- function (x) {
# Fletcher's chebyquad function m = n -- residuals 
   n<-length(x)
   res<-rep(0,n) # initialize
   for (i in 1:n) { #loop over resids
     rr<-0.0
     for (k in 1:n) {
	z7<-1.0
	z2<-2.0*x[k]-1.0
        z8<-z2
        j<-1
        while (j<i) {
            z6<-z7
            z7<-z8
            z8<-2*z2*z7-z6 # recurrence to compute Chebyshev polynomial
            j<-j+1
        } # end recurrence loop
        rr<-rr+z8
      } # end loop on k
      rr<-rr/n
      if (2*trunc(i/2) == i) { rr <- rr + 1.0/(i*i - 1) }
      res[i]<-rr
    } # end loop on i
    res
}

cyq.jac<- function (x) {
#  Chebyquad Jacobian matrix
   n<-length(x)
   cj<-matrix(0.0, n, n)
   for (i in 1:n) { # loop over rows
     for (k in 1:n) { # loop over columns (parameters)
       z5<-0.0
       cj[i,k]<-2.0
       z8<-2.0*x[k]-1.0 
       z2<-z8
       z7<-1.0
       j<- 1
       while (j<i) { # recurrence loop
         z4<-z5
         z5<-cj[i,k]
         cj[i,k]<-4.0*z8+2.0*z2*z5-z4
         z6<-z7
         z7<-z8
         z8<-2.0*z2*z7-z6
         j<- j+1
       } # end recurrence loop
       cj[i,k]<-cj[i,k]/n
     } # end loop on k
   } # end loop on i
   cj
}


cyq.g <- function(x) {
   cj<-cyq.jac(x)
   rv<-cyq.res(x)
   # Need to escape the % in .Rd file
   gg<- as.vector(2.0* rv %*% cj)
}

## Not run:  # Use for copy and paste
cyq.g <- function(x) {
   cj<-cyq.jac(x)
   rv<-cyq.res(x)
   # Need to escape the % in .Rd file
   gg<- as.vector(2.0* rv %*% cj)
}

## End(Not run)


startx<-c(0.2, 0.6)
names(startx)<-c("First","Second")
cat("check the names on the parameters\n")
print(startx)
ansnames<-optimx(startx, cyq.f, cyq.g, control=list(all.methods=TRUE))
ansnames
cat("check the names on the parameters\n")
print(ansnames$method)
print(ansnames$par)
bestans<-get.best(ansnames)
cat("The get.best result\n")
print(bestans)
cat("\n\n")

## Not run: 
startx<-c(0.2, 0.4, 0.6, 0.8)
cat("check Hessian -- no bounds\n")
# should check bounds, control.kkt=FALSE, etc.
anshes<-optimx(startx,cyq.f, cyq.g, hessian=TRUE, control=list(all.methods=TRUE))
# hesBFGS<-get.result(anshes, attribute="nhatend")
adet<-attr(anshes,"details")
nmeth<-length(adet)
for (i in 1:nmeth){
   print(adet[[i]]$method)
   print(adet[[i]]$nhatend)
   print(adet[[i]]$evnhatend)
   cat("=======================================\n")
}

## End(Not run)
