Description

General-purpose optimization wrapper function that replaces the default optim() function.
Usage

optim(par, fn, gr = NULL, lower = -Inf, upper = Inf,
      method = c("Nelder-Mead", "BFGS"), itnmax = NULL, hessian = FALSE,
      control = list(), ...)
Arguments

par
    A vector of initial values for the parameters for which optimal values are to be found.

fn
    A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.

gr
    A function to return (as a vector) the gradient for those methods that can use this information. If 'gr' is NULL, a finite-difference approximation will be used.

lower, upper
    Bounds on the variables for methods such as "L-BFGS-B" that can handle box (or bounds) constraints.

method
    A list of the methods to be used. Note that this is an important change from optim(), which allows just one method to be specified. See ‘Details’.

itnmax
    If provided as a vector of the same length as the list of methods, gives the maximum number of iterations or function values for the corresponding method.

hessian
    A logical control that if TRUE forces the computation of an approximation to the Hessian at the final set of parameters. If FALSE (default), the Hessian is calculated if needed to provide the KKT optimality tests (see kkt in ‘Details’ for the control list).

control
    A list of control parameters. See ‘Details’.

...
    Further arguments to be passed to fn and gr.
Details

Note that arguments after ... must be matched exactly.
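For example, extra named arguments travel through the dots to fn and gr. A minimal sketch, assuming this wrapper has replaced optim() (f and a are illustrative names, not part of this package):

f <- function(x, a) sum((x - a)^2)   # 'a' is an extra argument of fn
ans <- optim(c(1, 2), f, a = 3)      # 'a = 3' is passed through ... to f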
By default this function performs minimization, but it will maximize if control$maximize is TRUE. The original optim() function instead allows control$fnscale to be set negative to accomplish this. Do not use both mechanisms in the same call.
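A minimal sketch of the two approaches, again assuming this wrapper has replaced optim() (f is an illustrative function; use only one approach per call):

f <- function(x) 10 - sum((x - 1)^2)        # maximum of 10 at x = c(1, 1)
amax <- optim(c(0, 0), f, control = list(maximize = TRUE))
aneg <- optim(c(0, 0), function(x) -f(x))   # classic route: minimize -f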
Possible method codes at the time of writing are 'Nelder-Mead', 'BFGS', 'CG', and 'L-BFGS-B'. Note that 'SANN' is allowed in the default optim() but is not available here.
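Since method accepts several names, one call can apply more than one optimizer. A sketch under the same assumption (fq is an illustrative quadratic):

fq <- function(x) sum((x - c(1, 2))^2)
ans <- optim(c(5, 5), fq, method = c("Nelder-Mead", "BFGS"),
             itnmax = c(200, 75))   # per-method iteration limits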
The control argument is a list that can supply any of the following components (a combined sketch follows the list):
trace
    Non-negative integer. If positive, tracing information on the progress of the optimization is produced. Higher values may produce more tracing information: for method "L-BFGS-B" there are six levels of tracing, while trace = 0 gives no output. (To understand exactly what these do, see the source code: higher levels give more detail.)
maximize
    = TRUE if we want to maximize rather than minimize a function (default FALSE). Methods nlm, nlminb and ucminf cannot maximize a function, so the user must explicitly minimize and carry out the adjustment externally. However, there is a check to avoid usage of these codes when maximize is TRUE. See fnscale below for the mechanism used in optim, which we deprecate.
starttests
    = TRUE if we want to run tests of the function and parameters (feasibility relative to bounds, analytic vs. numerical gradient, and scaling tests) before we try optimization methods. Default is TRUE.
dowarn
    = TRUE if we want warnings generated by optim. Default is TRUE.
badval
    = The value to set for the function value when try(fn()) fails. Default is (0.5)*.Machine$double.xmax.
The following control elements apply only to some of the methods. The list may be incomplete. See individual packages for details.
fnscale
    An overall scaling to be applied to the value of fn and gr during optimization. If negative, turns the problem into a maximization problem. Optimization is performed on fn(par)/fnscale. Applies to methods from the set in optim(). Note potential conflicts with the control maximize that is used in some optimizers.
parscale
    This vector of scaling values for the parameters is NOT available in package optreplace, as its use has a quite high performance cost.
maxit
    The maximum number of iterations. Defaults to 100 for the derivative-based methods, and 500 for "Nelder-Mead".
abstol
    The absolute convergence tolerance. Only useful for non-negative functions, as a tolerance for reaching zero.
reltol
    Relative convergence tolerance. The algorithm stops if it is unable to reduce the value by a factor of reltol * (abs(val) + reltol) at a step. Defaults to sqrt(.Machine$double.eps), typically about 1e-8. Applies to methods from optim().
alpha, beta, gamma
    Scaling parameters for the "Nelder-Mead" method. alpha is the reflection factor (default 1.0), beta the contraction factor (0.5) and gamma the expansion factor (2.0).
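A combined sketch of several of these controls, assuming this wrapper has replaced optim() (the values are illustrative, not recommendations):

ctrl <- list(trace = 1,       # some progress output
             maxit = 500,     # iteration limit
             reltol = 1e-10)  # tighter relative convergence tolerance
ans <- optim(c(-1.2, 1), function(x) sum((x - 1)^2), control = ctrl)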
Any names given to par will be copied to the vectors passed to fn and gr; the 'check the names' example below demonstrates this. Note that no other attributes of par are copied over. (We had not verified this as of 2011-11-02.)
Value

ans is a list of the elements:
- the final parameters of the function
- the final value of the objective function
- a code indicating whether and/or how the method terminated
- a descriptive message, if one is provided
- the number of times the objective function was evaluated during the run; this item has an attribute "names" that can take on the value "function"
- the number of times the gradient of the objective function was evaluated during the run; this item has an attribute "names" that can take on the value "gradient"
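Typical access to the result, assuming optim()-style element names (par, value, convergence, message, counts; these names are an assumption, as they are not stated above):

ans <- optim(c(-1.2, 1), function(x) sum((x - 1)^2))
ans$par           # final parameters
ans$value         # final objective function value
ans$convergence   # termination code; 0 normally indicates success
ans$counts        # function (and gradient) evaluation counts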
Note

optim will work with one-dimensional par vectors, but the default method does not work well (and will warn). Use optimize instead.
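For example, a one-parameter problem is better posed to base R's optimize():

f1 <- function(x) (x - 0.25)^2
optimize(f1, interval = c(-5, 5))   # returns $minimum and $objective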
There are a series of demos available. These were set up as tests, but take quite a long time to execute. Once the package is loaded (via require(optim) or library(optim)), you may see the available demos via demo(package="optim"). The demo 'brown_test' may be run with the command demo(brown_test, package="optim").
Source

To be added.
References

Belisle, C. J. P. (1992) Convergence theorems for a class of simulated annealing algorithms on Rd. Journal of Applied Probability, 29, 885–895.

Birgin, E. G., Martinez, J. M. and Raydan, M. (2001) SPG: Software for convex-constrained optimization. ACM Transactions on Mathematical Software, 27, 340–349.

Byrd, R. H., Lu, P., Nocedal, J. and Zhu, C. (1995) A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16, 1190–1208.

Dennis, J. E. and Schnabel, R. B. (1983) Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, NJ.

Fletcher, R. and Reeves, C. M. (1964) Function minimization by conjugate gradients. Computer Journal, 7, 148–154.

Fletcher, R. (1970) A new approach to variable metric algorithms. Computer Journal, 13, 317–322.

Nash, J. C. (1979, 1990) Compact Numerical Methods for Computers: Linear Algebra and Function Minimisation. Adam Hilger.

Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal, 7, 308–313.

Nocedal, J. and Wright, S. J. (1999) Numerical Optimization. Springer.

Schnabel, R. B., Koontz, J. E. and Weiss, B. E. (1985) A modular system of algorithms for unconstrained minimization. ACM Transactions on Mathematical Software, 11, 419–440.

Varadhan, R. and Gilbert, P. D. (2009) BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function. Journal of Statistical Software, 32, 1–26.
See Also

optimize for one-dimensional minimization.
Examples

require(graphics)
cat("Test warning on 1-parameter problems\n")
quad<-function(x){
qq<-1:length(x)
fn<-as.numeric(crossprod((x-qq)/qq))
}
quad.g<-function(x){
qq<-1:length(x)
gg<-2*(x-qq)/(qq*qq)
}
x<-c(1.23, 4.32)
a2pardef<-optim(x, quad, quad.g, control=list(trace=3))
print(a2pardef)
a2bfgs<-optim(x, quad, quad.g, method='BFGS')
print(a2bfgs)
## Not run:
x<-0.1234
a1par<-optim(x, quad, quad.g, method="ALL")
print(a1par)
## End(Not run)
fr <- function(x) { ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) { ## Gradient of 'fr'
x1 <- x[1]
x2 <- x[2]
c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
200 * (x2 - x1 * x1))
}
cat("Rosenbrock function, with and without analytic gradients\n")
ans1<-optim(c(-1.2,1), fr)
print(ans1)
# print(attr(ans1,"details"))
## Not run:
cat("\n\n")
cat("Analytic gradients\n")
ans2<-optim(c(-1.2,1), fr, grr, method='BFGS')
print(ans2)
## End(Not run)
cat("\n\n")
cat("CG ignore 'type=...'\n")
ans5<-optim(c(-1.2,1), fr, grr, method = "CG", control=list(trace=0))
print(ans5)
cat("\n\n")
cat("25-dimensional box constrained function\n")
cat("Note: nmkb fails. Starting vector must have no parameter on bound.\n")
flb <- function(x)
{ p <- length(x); sum(c(1, rep(4, p-1)) * (x - c(1, x[-p])^2)^2) }
ans7<-optim(rep(3, 25), flb, lower=rep(2, 25), upper=rep(4, 25)) # par[24] is *not* at boundary
print(ans7)
cat("Check on maximization\n")
maxfn<-function(x) {
n<-length(x)
ss<-seq(1,n)
f<-10-(crossprod(x-ss))^2
f<-as.numeric(f)
return(f)
}
negmaxfn<-function(x) { # Can minimize this as an alternative
f<-(-1)*maxfn(x)
return(f)
}
x0<-rep(pi,4)
cat("maximization using standard forward difference numerical derivatives\n")
ans.mx<-optim(x0,maxfn, gr=NULL, control=list(maximize=TRUE, trace=0))
print(ans.mx)
cat("verify using explicit negation\n")
ans.mxn<-optim(x0,negmaxfn)
print(ans.mxn)
cat("\n\n")
# genrosa function code -- matches rosenbrock when npar=2 and gs=100
genrosa.f<- function(x, gs=NULL){ # objective function
## One generalization of the Rosenbrock banana valley function (n parameters)
n <- length(x)
if(is.null(gs)) { gs=100.0 }
  # Note: we do not add 1.0, so the minimum is exactly 0
fval<-sum (gs*(x[1:(n-1)]^2 - x[2:n])^2 + (x[1:(n-1)] - 1)^2)
}
genrosa.g <- function(x, gs=NULL){
# vectorized gradient for genrosa.f
# Ravi Varadhan 2009-04-03
n <- length(x)
if(is.null(gs)) { gs=100.0 }
gg <- as.vector(rep(0, n))
tn <- 2:n
tn1 <- tn - 1
z1 <- x[tn] - x[tn1]^2
z2 <- 1 - x[tn1]
# f = gs*z1*z1 + z2*z2
gg[tn] <- 2 * (gs * z1)
gg[tn1] <- gg[tn1] - 4 * gs * x[tn1] * z1 - 2 *z2
return(gg)
}
genrosa.h <- function(x, gs=NULL) { ## compute Hessian
if(is.null(gs)) { gs=100.0 }
n <- length(x)
hh<-matrix(rep(0, n*n),n,n)
for (i in 2:n) {
z1<-x[i]-x[i-1]*x[i-1]
# z2<-1.0 - x[i-1]
hh[i,i]<-hh[i,i]+2.0*(gs+1.0)
hh[i-1,i-1]<-hh[i-1,i-1]-4.0*gs*z1-4.0*gs*x[i-1]*(-2.0*x[i-1])
hh[i,i-1]<-hh[i,i-1]-4.0*gs*x[i-1]
hh[i-1,i]<-hh[i-1,i]-4.0*gs*x[i-1]
}
return(hh)
}
cat("Use genrosa to show dots, follow-on etc.\n")
startx<-4*seq(1:10)/3.
ans8<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, control=list(all.methods=TRUE, save.failures=TRUE, trace=0), gs=10)
print(ans8)
## Not run:
get.result(ans8, attribute="grs")
get.result(ans8, method="spg")
cat("\n\n")
## End(Not run)
## Not run:
cat("Try follow-on\n")
startx<-4*seq(1:10)/3.
cat("Polyalgorithm with 200 steps NM followed by up to 75 of ucminf\n")
ans9a<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h,
method=c("Nelder-Mead","ucminf"), itnmax=c(200,75),
control=list(follow.on=TRUE, save.failures=TRUE,trace=0), gs=10)
print(ans9a)
cat("\n\n")
## End(Not run)
cat("Try follow-on\n")
startx<-4*seq(1:10)/3.
cat("Polyalgorithm with 200 steps NM followed by up to 75 of Rvmmin\n")
ans9<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h,
method=c("Nelder-Mead","Rvmmin"), itnmax=c(200,75),
control=list(follow.on=TRUE, save.failures=TRUE,trace=0), gs=10)
print(ans9)
cat("\n\n")
## Not run:
cat("A case where methods do not work and we do not save failures\n")
startx<-4*seq(1:10)/3.
cat("200 steps NM is not enough to terminate\n")
ans10<-optim(startx,fn=genrosa.f,gr=genrosa.g, method=c("Nelder-Mead"),
itnmax=c(200), control=list(trace=0, save.failures=FALSE), gs=10)
cat("The answer should be NULL\n")
print(str(ans10))
cat("\n\n")
## End(Not run)
## Not run:
startx<-4*seq(1:10)/3.
cat("Try getting hessian but not kkt tests -- no results!\n")
ans11<-optim(startx,fn=genrosa.f,gr=genrosa.g, hessian=TRUE,
control=list(all.methods=TRUE, trace=0, save.failures=FALSE, kkt=FALSE),
gs=10)
print(ans11)
print(attr(ans11,"details")[7])
cat("\n\n")
## End(Not run)
## Not run:
startx<-4*seq(1:10)/3.
cat("Use analytic hessian and no KKT tests\n")
ans12<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h,
control=list(trace=0, save.failures=FALSE, kkt=FALSE), gs=10)
print(ans12)
print(attr(ans12,"details"))
## End(Not run)
cat("Example with bounds and masks\n")
startx<-4*seq(1:10)/7.
lo<-rep(0.5,10)
up<-rep(8,10)
ans13<-optim(startx,fn=genrosa.f,gr=genrosa.g, lower=lo, upper=up,
control=list(trace=0), gs=4)
print(ans13)
cat("\n\n")
# ======= check names =========
cyq.f <- function (x) {
rv<-cyq.res(x)
f<-sum(rv*rv)
}
cyq.res <- function (x) {
# Fletcher's chebyquad function m = n -- residuals
n<-length(x)
res<-rep(0,n) # initialize
for (i in 1:n) { #loop over resids
rr<-0.0
for (k in 1:n) {
z7<-1.0
z2<-2.0*x[k]-1.0
z8<-z2
j<-1
while (j<i) {
z6<-z7
z7<-z8
z8<-2*z2*z7-z6 # recurrence to compute Chebyshev polynomial
j<-j+1
} # end recurrence loop
rr<-rr+z8
} # end loop on k
rr<-rr/n
if (2*trunc(i/2) == i) { rr <- rr + 1.0/(i*i - 1) }
res[i]<-rr
} # end loop on i
res
}
cyq.jac<- function (x) {
# Chebyquad Jacobian matrix
n<-length(x)
cj<-matrix(0.0, n, n)
for (i in 1:n) { # loop over rows
for (k in 1:n) { # loop over columns (parameters)
z5<-0.0
cj[i,k]<-2.0
z8<-2.0*x[k]-1.0
z2<-z8
z7<-1.0
j<- 1
while (j<i) { # recurrence loop
z4<-z5
z5<-cj[i,k]
cj[i,k]<-4.0*z8+2.0*z2*z5-z4
z6<-z7
z7<-z8
z8<-2.0*z2*z7-z6
j<- j+1
} # end recurrence loop
cj[i,k]<-cj[i,k]/n
} # end loop on k
} # end loop on i
cj
}
cyq.g <- function(x) {
   cj <- cyq.jac(x)
   rv <- cyq.res(x)
   gg <- as.vector(2.0 * rv %*% cj)  # gradient of sum(rv*rv) via the Jacobian
}
startx<-c(0.2, 0.6)
names(startx)<-c("First","Second")
cat("check the names on the parameters\n")
print(startx)
ansnames<-optim(startx, cyq.f, cyq.g, method='L-BFGS-B')
ansnames
cat("check the names on the parameters\n")
print(ansnames$method)
print(ansnames$par)
cat("\n\n")
startx<-c(0.2, 0.4, 0.6, 0.8)
cat("check Hessian -- no bounds\n")
# should check bounds, control.kkt=FALSE, etc.
anshes<-optim(startx,cyq.f, cyq.g, hessian=TRUE, method='BFGS')
adet<-attr(anshes,"details")