optim: General-purpose optimization - optreplace version


View source: R/optim.R

Description

General-purpose optimization wrapper function that replaces the default optim() function.

Usage

optim(par, fn, gr=NULL, lower=-Inf, upper=Inf,
      method="Nelder-Mead", itnmax=NULL, hessian=FALSE,
      control=list(), ...)

Arguments

par

A vector of initial values for the parameters for which optimal values are to be found.

fn

A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.

gr

A function to return (as a vector) the gradient for those methods that can use this information.

If 'gr' is NULL, a finite-difference approximation will be used.
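
The internal approximation is not reproduced here, but a minimal sketch of a forward-difference gradient of the kind typically used (the function fd.grad and the step size eps are illustrative assumptions, not the package's code) is:

fd.grad <- function(fn, par, eps = 1e-7, ...) {
    # illustrative forward-difference gradient, NOT the package's internal routine
    f0 <- fn(par, ...)
    gg <- numeric(length(par))
    for (i in seq_along(par)) {
        pp <- par
        pp[i] <- pp[i] + eps
        gg[i] <- (fn(pp, ...) - f0) / eps
    }
    gg
}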

lower, upper

Bounds on the variables for methods such as "L-BFGS-B" that can handle box (or bounds) constraints.

method

A list of the methods to be used. Note that this is an important change from the default optim(), which allows just one method to be specified. See ‘Details’.

itnmax

If provided as a vector of the same length as the list of methods in method, this gives the maximum number of iterations or function evaluations allowed for the corresponding method. If a single number is provided, it is used for all methods. Note that there may be control list elements with similar functions, but itnmax should be the preferred approach when using this optim, as in the sketch below.
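
For example, a two-method run with per-method limits might be requested as follows (a sketch only; sqtest is a simple quadratic defined purely for illustration, and both methods appear in the list given under ‘Details’):

sqtest <- function(x) sum((x - 1)^2)   # simple quadratic for illustration
# up to 100 Nelder-Mead iterations, then up to 50 BFGS iterations
asq <- optim(c(2, 2, 2), sqtest, method = c("Nelder-Mead", "BFGS"),
             itnmax = c(100, 50))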

hessian

A logical control that, if TRUE, forces the computation of an approximation to the Hessian at the final set of parameters. If FALSE (the default), the Hessian is calculated if needed to provide the KKT optimality tests (see kkt in ‘Details’ for the control list). This setting is provided primarily for compatibility with optim(). Note that there are extended possibilities in ucminf, where hessian is treated as an integer, but these, as well as the facility in optim(), are ignored, and the numerical Hessian from the kktc package is used.

control

A list of control parameters. See ‘Details’.

...

Further arguments to be passed to fn and gr.

Details

Note that arguments after ... must be matched exactly.

By default this function performs minimization, but it will maximize if control$maximize is TRUE, as in the sketch below. The original optim() function instead allows control$fnscale to be set negative to accomplish this. DO NOT use both mechanisms in the same call.
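
A minimal sketch of the supported maximization mechanism (negsq is an illustrative concave function, not part of the package):

negsq <- function(x) 10 - sum((x - 2)^2)   # maximum of 10 at x = c(2, 2)
amax <- optim(c(0, 0), negsq, control = list(maximize = TRUE))
## the deprecated optim() convention -- do not combine with maximize:
## amax2 <- optim(c(0, 0), negsq, control = list(fnscale = -1))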

Possible method codes at the time of writing are 'Nelder-Mead', 'BFGS', 'CG', 'L-BFGS-B'.

Note: 'SANN' is allowed in the default optim() but is not available here.

The control argument is a list that can supply any of the following components:

trace

Non-negative integer. If positive, tracing information on the progress of the optimization is produced. Higher values may produce more tracing information: for method "L-BFGS-B" there are six levels of tracing. trace = 0 gives no output. (To understand exactly what the levels do, see the source code: higher levels give more detail.)

maximize

= TRUE if we want to maximize rather than minimize a function (default FALSE). Methods nlm, nlminb and ucminf cannot maximize a function, so the user must explicitly minimize the negated function and carry out any adjustment externally. However, there is a check to avoid use of these codes when maximize is TRUE. See fnscale below for the mechanism used in optim(), which we deprecate.

starttests

= TRUE if we want to run tests of the function and parameters (feasibility relative to bounds, analytic versus numerical gradient, scaling tests) before we try optimization methods. Default is TRUE.

dowarn

= TRUE if we want warnings generated by optim. Default is TRUE.

badval

= The value to set for the function value when try(fn()) fails.
Default is (0.5)*.Machine$double.xmax

The following control elements apply only to some of the methods. The list may be incomplete. See individual packages for details.

fnscale

An overall scaling to be applied to the value of fn and gr during optimization. If negative, turns the problem into a maximization problem. Optimization is performed on fn(par)/fnscale. This control applies to the methods from the optim() set. Note potential conflicts with the control maximize that is used in some optimizers.

parscale

This vector of scaling values for the parameters is NOT available in package optreplace, as its use has a quite high performance cost.

maxit

The maximum number of iterations. Defaults to 100 for the derivative-based methods, and 500 for "Nelder-Mead".

abstol

The absolute convergence tolerance. Only useful for non-negative functions, as a tolerance for reaching zero.

reltol

Relative convergence tolerance. The algorithm stops if it is unable to reduce the value by a factor of reltol * (abs(val) + reltol) at a step. Defaults to sqrt(.Machine$double.eps), typically about 1e-8. Applies to the methods from optim().

alpha, beta, gamma

Scaling parameters for the "Nelder-Mead" method. alpha is the reflection factor (default 1.0), beta the contraction factor (0.5) and gamma the expansion factor (2.0).
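
As an illustration, these scalings (here simply the defaults, passed explicitly) could be supplied as follows; nmfn is a test function defined only for this sketch:

nmfn <- function(x) sum((x - 3)^2)   # simple test function for illustration
anm <- optim(c(0, 0), nmfn, method = "Nelder-Mead",
             control = list(alpha = 1.0, beta = 0.5, gamma = 2.0))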

Any names given to par will be copied to the vectors passed to fn and gr. Note that no other attributes of par are copied over. (We have not verified this as of 2011-11-02.)

Value

ans is a list of the elements:

par

- the final parameters of the function

value

- the final value of the objective function

convergence

- a code indicating whether and/or how the method terminated

message

- a descriptive message, if one is provided

fevals

- the number of times the objective function was evaluated during the run; this item has an attribute "names" that can take on the value "function"

gevals

- the number of times the gradient of the objective function was evaluated during the run; this item has an attribute "names" that can take on the value "gradient"
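
A brief sketch of how the returned list might be inspected (sfn is an illustrative test function; the element names follow the list above):

sfn <- function(x) sum((x - c(1, 2))^2)   # simple function for illustration
ares <- optim(c(5, 5), sfn)
ares$par           # final parameters
ares$value         # final objective value
ares$convergence   # termination code (0 normally indicates apparent success)
ares$fevals        # number of function evaluations
ares$gevals        # number of gradient evaluations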

Note

optim will work with one-dimensional parameter vectors, but the default method does not work well (and will warn). Use optimize instead, as in the sketch below.
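
For example, a one-parameter problem is better handled directly by optimize() (one.par is an illustrative function):

one.par <- function(x) (x - 2)^2 + 1   # minimum at x = 2
optimize(one.par, interval = c(-5, 5))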

Several demos are available. These were set up as tests, but take quite a long time to execute. Once the package is loaded (via require(optim) or library(optim)), you may see the available demos via

demo(package="optim")

The demo 'brown_test' may be run with the command demo(brown_test, package="optim")

Source

To be added.

References

Belisle, C. J. P. (1992) Convergence theorems for a class of simulated annealing algorithms on Rd. J Applied Probability, 29, 885–895.

Birgin EG, Martinez JM, and Raydan M (2001): SPG: software for convex-constrained optimization, ACM Transactions on Mathematical Software. ??incomplete reference??

Byrd, R. H., Lu, P., Nocedal, J. and Zhu, C. (1995) A limited memory algorithm for bound constrained optimization. SIAM J. Scientific Computing, 16, 1190–1208.

Dennis, J. E. and Schnabel, R. B. (1983) _Numerical Methods for Unconstrained Optimization and Nonlinear Equations._ Prentice-Hall, Englewood Cliffs, NJ.

Fletcher, R. and Reeves, C. M. (1964) Function minimization by conjugate gradients. Computer Journal 7, 148–154.

Fletcher, R (1970) A new approach to variable metric algorithms, Computer Journal, 13, 317-322.

Nash, J. C. (1979, 1990) Compact Numerical Methods for Computers. Linear Algebra and Function Minimisation. Adam Hilger.

Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal 7, 308–313.

Nocedal, J. and Wright, S. J. (1999) Numerical Optimization. Springer.

Schnabel, R. B., Koontz, J. E. and Weiss, B. E. (1985) A modular system of algorithms for unconstrained minimization. _ACM Trans. Math. Software_, *11*, 419-440.

Varadhan, R. and Gilbert, P.D. (2009) BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function. Journal of Statistical Software, 32, 1–26.

See Also

Rcgmin, Rvmmin, dfoptim.

optimize for one-dimensional minimization.

Examples

require(graphics)

cat("Test warning on 1-parameter problems\n")
quad<-function(x){
    qq<-1:length(x)
    fn<-as.numeric(crossprod((x-qq)/qq))
}
quad.g<-function(x){
    qq<-1:length(x)
    gg<-2*(x-qq)/(qq*qq)
}

x<-c(1.23, 4.32)
a2pardef<-optim(x, quad, quad.g, control=list(trace=3))
print(a2pardef)
a2bfgs<-optim(x, quad, quad.g, method='BFGS')
print(a2bfgs)


## Not run: 
x<-0.1234
a1par<-optim(x, quad, quad.g, method="ALL")
print(a1par)

## End(Not run)


fr <- function(x) {   ## Rosenbrock Banana function
    x1 <- x[1]
    x2 <- x[2]
    100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) { ## Gradient of 'fr'
    x1 <- x[1]
    x2 <- x[2]
    c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
       200 *      (x2 - x1 * x1))
}
cat("Rosenbrock function, with and without analytic gradients\n")

ans1<-optim(c(-1.2,1), fr)
print(ans1)
# print(attr(ans1,"details"))
## Not run: 
cat("\n\n")
cat("Analytic gradients\n")
ans2<-optim(c(-1.2,1), fr, grr, method='BFGS')
print(ans2)

## End(Not run)



cat("\n\n")
cat("CG ignore 'type=...'\n")
ans5<-optim(c(-1.2,1), fr, grr, method = "CG", control=list(trace=0))
print(ans5)
cat("\n\n")


cat("25-dimensional box constrained function\n")
cat("Note: nmkb fails. Starting vector must have no parameter on bound.\n")
flb <- function(x)
    { p <- length(x); sum(c(1, rep(4, p-1)) * (x - c(1, x[-p])^2)^2) }

ans7<-optim(rep(3, 25), flb, lower=rep(2, 25), upper=rep(4, 25)) # par[24] is *not* at boundary
print(ans7)


cat("Check on maximization\n")

maxfn<-function(x) {
      	n<-length(x)
	ss<-seq(1,n)
	f<-10-(crossprod(x-ss))^2
	f<-as.numeric(f)
	return(f)
}

negmaxfn<-function(x) { # Can minimize this as an alternative
	f<-(-1)*maxfn(x)
	return(f)
}


x0<-rep(pi,4)
cat("maximization using standard forward difference numerical derivatives\n")
ans.mx<-optim(x0,maxfn, gr=NULL, control=list(maximize=TRUE, trace=0))
print(ans.mx)

cat("verify using explicit negation\n")
ans.mxn<-optim(x0,negmaxfn)
print(ans.mxn)
cat("\n\n")

# genrosa function code -- matches rosenbrock when npar=2 and gs=100
genrosa.f<- function(x, gs=NULL){ # objective function
## One generalization of the Rosenbrock banana valley function (n parameters)
	n <- length(x)
        if(is.null(gs)) { gs=100.0 }
        # Note: do not add 1.0, so the minimum is at 0
	fval<-sum (gs*(x[1:(n-1)]^2 - x[2:n])^2 + (x[1:(n-1)] - 1)^2)
}

genrosa.g <- function(x, gs=NULL){
# vectorized gradient for genrosa.f
# Ravi Varadhan 2009-04-03
	n <- length(x)
        if(is.null(gs)) { gs=100.0 }
	gg <- as.vector(rep(0, n))
	tn <- 2:n
	tn1 <- tn - 1
	z1 <- x[tn] - x[tn1]^2
	z2 <- 1 - x[tn1]
        # f = gs*z1*z1 + z2*z2
	gg[tn] <- 2 * (gs * z1)
	gg[tn1] <- gg[tn1] - 4 * gs * x[tn1] * z1 - 2 *z2 
	return(gg)
}

genrosa.h <- function(x, gs=NULL) { ## compute Hessian
   if(is.null(gs)) { gs=100.0 }
	n <- length(x)
	hh<-matrix(rep(0, n*n),n,n)
	for (i in 2:n) {
		z1<-x[i]-x[i-1]*x[i-1]
#		z2<-1.0 - x[i-1]
                hh[i,i]<-hh[i,i]+2.0*(gs+1.0)
                hh[i-1,i-1]<-hh[i-1,i-1]-4.0*gs*z1-4.0*gs*x[i-1]*(-2.0*x[i-1])
                hh[i,i-1]<-hh[i,i-1]-4.0*gs*x[i-1]
                hh[i-1,i]<-hh[i-1,i]-4.0*gs*x[i-1]
	}
        return(hh)
}

cat("Use genrosa to show dots, follow-on etc.\n")
startx<-4*seq(1:10)/3.
ans8<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, control=list(all.methods=TRUE, save.failures=TRUE, trace=0), gs=10)
print(ans8)

## Not run: 
get.result(ans8, attribute="grs")
get.result(ans8, method="spg")
cat("\n\n")

## End(Not run)

## Not run: 
cat("Try follow-on\n")
startx<-4*seq(1:10)/3.
cat("Polyalgorithm with 200 steps NM followed by up to 75 of ucminf\n")
ans9a<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, 
             method=c("Nelder-Mead","ucminf"), itnmax=c(200,75), 
             control=list(follow.on=TRUE, save.failures=TRUE,trace=0), gs=10)
print(ans9a)
cat("\n\n")

## End(Not run)

cat("Try follow-on\n")
startx<-4*seq(1:10)/3.
cat("Polyalgorithm with 200 steps NM followed by up to 75 of Rvmmin\n")
ans9<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, 
             method=c("Nelder-Mead","Rvmmin"), itnmax=c(200,75), 
             control=list(follow.on=TRUE, save.failures=TRUE,trace=0), gs=10)
print(ans9)
cat("\n\n")

## Not run: 
cat("A case where methods do not work and we do not save failures\n")
startx<-4*seq(1:10)/3.
cat("200 steps NM is not enough to terminate\n")
ans10<-optim(startx,fn=genrosa.f,gr=genrosa.g, method=c("Nelder-Mead"),
             itnmax=c(200), control=list(trace=0, save.failures=FALSE), gs=10)
cat("The answer should be NULL\n")
print(str(ans10))
cat("\n\n")

## End(Not run)

## Not run: 
startx<-4*seq(1:10)/3.
cat("Try getting hessian but not kkt tests -- no results!\n")
ans11<-optim(startx,fn=genrosa.f,gr=genrosa.g, hessian=TRUE,  
    control=list(all.methods=TRUE, trace=0, save.failures=FALSE, kkt=FALSE),
    gs=10)
print(ans11)
print(attr(ans11,"details")[7])
cat("\n\n")

## End(Not run)

## Not run: 
startx<-4*seq(1:10)/3.
cat("Use analytic hessian and no KKT tests\n")
ans12<-optim(startx,fn=genrosa.f,gr=genrosa.g, hess=genrosa.h, 
      control=list(trace=0, save.failures=FALSE, kkt=FALSE), gs=10)
print(ans12)
print(attr(ans12,"details"))

## End(Not run)

cat("Example with bounds and masks\n")
startx<-4*seq(1:10)/7.
lo<-rep(0.5,10)
up<-rep(8,10)
ans13<-optim(startx,fn=genrosa.f,gr=genrosa.g, lower=lo, upper=up,  
       control=list(trace=0), gs=4)
print(ans13)
cat("\n\n")



# ======= check names =========
cyq.f <- function (x) {
  rv<-cyq.res(x)
  f<-sum(rv*rv)
}

cyq.res <- function (x) {
# Fletcher's chebyquad function m = n -- residuals 
   n<-length(x)
   res<-rep(0,n) # initialize
   for (i in 1:n) { #loop over resids
     rr<-0.0
     for (k in 1:n) {
	z7<-1.0
	z2<-2.0*x[k]-1.0
        z8<-z2
        j<-1
        while (j<i) {
            z6<-z7
            z7<-z8
            z8<-2*z2*z7-z6 # recurrence to compute Chebyshev polynomial
            j<-j+1
        } # end recurrence loop
        rr<-rr+z8
      } # end loop on k
      rr<-rr/n
      if (2*trunc(i/2) == i) { rr <- rr + 1.0/(i*i - 1) }
      res[i]<-rr
    } # end loop on i
    res
}

cyq.jac<- function (x) {
#  Chebyquad Jacobian matrix
   n<-length(x)
   cj<-matrix(0.0, n, n)
   for (i in 1:n) { # loop over rows
     for (k in 1:n) { # loop over columns (parameters)
       z5<-0.0
       cj[i,k]<-2.0
       z8<-2.0*x[k]-1.0 
       z2<-z8
       z7<-1.0
       j<- 1
       while (j<i) { # recurrence loop
         z4<-z5
         z5<-cj[i,k]
         cj[i,k]<-4.0*z8+2.0*z2*z5-z4
         z6<-z7
         z7<-z8
         z8<-2.0*z2*z7-z6
         j<- j+1
       } # end recurrence loop
       cj[i,k]<-cj[i,k]/n
     } # end loop on k
   } # end loop on i
   cj
}


cyq.g <- function(x) {
   cj<-cyq.jac(x)
   rv<-cyq.res(x)
   # Need to escape the % in .Rd file
   gg<- as.vector(2.0* rv %*% cj)
}

## Not run:  # Use for copy and paste
cyq.g <- function(x) {
   cj<-cyq.jac(x)
   rv<-cyq.res(x)
   # Need to escape the % in .Rd file
   gg<- as.vector(2.0* rv %*% cj)
}

## End(Not run)


startx<-c(0.2, 0.6)
names(startx)<-c("First","Second")
cat("check the names on the parameters\n")
print(startx)
ansnames<-optim(startx, cyq.f, cyq.g, method='L-BFGS-B')
ansnames
cat("check the names on the parameters\n")
print(ansnames$method)
print(ansnames$par)
cat("\n\n")

startx<-c(0.2, 0.4, 0.6, 0.8)
cat("check Hessian -- no bounds\n")
# should check bounds, control.kkt=FALSE, etc.
anshes<-optim(startx,cyq.f, cyq.g, hessian=TRUE, method='BFGS')
adet<-attr(anshes,"details")
