maxControl: Class "MaxControl"
In maxLik: Maximum Likelihood Estimation and Related Tools

MaxControl-class

R Documentation

Class `"MaxControl"`

Description

This is the structure that holds the optimization control options. The corresponding constructors take the parameters, perform consistency checks, and return the control structure. Alternatively, they overwrite the supplied parameters in an existing MaxControl structure. There is also a method to extract the control structure from estimated ‘maxim’ objects.

Slots

The default values and definition of the slots:

tol: 1e-8, stopping condition for maxNR and related optimizers. Stops if the absolute difference between the objective function values of successive iterations is less than tol; returns code 2.
reltol: sqrt(.Machine$double.eps), relative convergence tolerance (used by maxNR related optimizers and optim-based optimizers). The algorithm stops if an iteration increases the value by less than a factor of reltol*(abs(val) + reltol). Returns code 2.
gradtol: 1e-6, stopping condition for maxNR and related optimizers. Stops if the norm of the gradient is less than gradtol; returns code 1.
steptol: 1e-10, stopping/error condition for maxNR and related optimizers. If qac == "stephalving" and the quadratic approximation leads to a worse value instead of a better one, or to NA, the step length is halved and a new attempt is made. If necessary, this procedure is repeated until step < steptol, after which code 3 is returned.
lambdatol: 1e-6, (for maxNR related optimizers) controls whether the Hessian is treated as negative definite. If the largest eigenvalue of the Hessian is larger than -lambdatol (the Hessian is not negative definite), a suitable diagonal matrix is subtracted from the Hessian (quadratic hill-climbing) in order to enforce negative definiteness.
qac: "stephalving", character, Quadratic Approximation Correction for maxNR related optimizers. When the new guess is worse than the initial one, the program attempts to correct it: "stephalving" decreases the step but keeps the direction. "marquardt" uses the Marquardt (1963) method, decreasing the step length while also moving closer to the pure gradient direction. It may be a faster and more robust choice in areas where the quadratic approximation behaves poorly.
qrtol: 1e-10, QR-decomposition tolerance for Hessian inversion in maxNR related optimizers.
marquardt_lambda0: 0.01, a positive numeric, initial correction term for the Marquardt (1963) correction in maxNR-related optimizers.
marquardt_lambdaStep: 2, how much the Marquardt (1963) correction is decreased/increased at a successful/unsuccessful step for maxNR related optimizers.
marquardt_maxLambda: 1e12, maximum allowed correction term for maxNR related optimizers. If exceeded, the algorithm exits with return code 3.
nm_alpha: 1, Nelder-Mead simplex method reflection factor (see Nelder & Mead, 1965)
nm_beta: 0.5, Nelder-Mead contraction factor
nm_gamma: 2, Nelder-Mead expansion factor
sann_cand: NULL or a function for "SANN" algorithm to generate a new candidate point; if NULL, Gaussian Markov kernel is used (see argument gr of optim).
sann_temp: 10, starting temperature for the “SANN” cooling schedule. See optim.
sann_tmax: 10, number of function evaluations at each temperature for the “SANN” optimizer. See optim.
sann_randomSeed: 123, integer to seed random numbers to ensure replicability of “SANN” optimization and preserve R random numbers. Use options like sann_randomSeed=Sys.time() or sann_randomSeed=sample(1000,1) if you want stochastic results.
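
The lambdatol and steptol descriptions above can be illustrated with a small base-R sketch. The helpers enforceNegDef and halveStep are hypothetical names introduced here, not part of maxLik, and the size of the diagonal shift is an assumption; the code only mimics the documented behaviour and is not the package's implementation.

```r
## Hypothetical helpers, not part of maxLik.

## lambdatol: if the largest eigenvalue exceeds -lambdatol, subtract a
## diagonal matrix so that the Hessian becomes negative definite.
## The size of the shift is an assumption made for this sketch.
enforceNegDef <- function(H, lambdatol = 1e-6) {
  maxEig <- max(eigen(H, symmetric = TRUE, only.values = TRUE)$values)
  if (maxEig > -lambdatol) {
    H <- H - (maxEig + 2 * lambdatol) * diag(nrow(H))
  }
  H
}

## qac = "stephalving": keep the direction, halve the step until the
## value improves; give up with code 3 once step < steptol.
halveStep <- function(fn, x, direction, steptol = 1e-10) {
  f0 <- fn(x)
  step <- 1
  while (step >= steptol) {
    f1 <- fn(x + step * direction)
    if (!is.na(f1) && f1 > f0) {
      return(list(x = x + step * direction, step = step, code = NA_integer_))
    }
    step <- step / 2
  }
  list(x = x, step = step, code = 3L)   # no improvement found
}

H <- matrix(c(-2, 0, 0, 1), 2, 2)       # indefinite: eigenvalues -2 and 1
Hfixed <- enforceNegDef(H)

fn <- function(x) -sum(x^2)             # maximum at 0
res <- halveStep(fn, x = 1, direction = -4)   # the full step overshoots
```

In this toy case the full step and the first halved step do not improve the value, so the step is halved twice before the new point is accepted.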

General options for stochastic gradient methods:

SG_learningRate: 0.1, learning rate, numeric
SG_batchSize: NULL, batch size for Stochastic Gradient Ascent. A positive integer, or NULL for full-batch gradient ascent.
SG_clip: NULL, gradient clipping threshold. This is the maximum allowed squared Euclidean norm of the gradient. If the actual norm of the gradient exceeds the square root of this threshold, the gradient is scaled back accordingly while preserving its direction. NULL means no clipping.
SG_patience: NULL, or integer. Stopping condition: if the objective function is worse than its largest value so far this many times, the algorithm stops and returns not the last parameter value but the one that gave the best result so far. This is mostly useful if the gradient is computed on training data and the objective function on validation data.
SG_patienceStep: 1L, integer. After how many epochs to check the patience value. 1 means to check (and hence to compute the objective function) at each epoch.
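
The clipping rule described for SG_clip can be sketched in base R as follows. The function clipGradient is a hypothetical helper written for illustration, not a maxLik function; it assumes only what the slot description states, namely that the threshold bounds the squared Euclidean norm and that the direction is preserved.

```r
## Hypothetical helper, not part of maxLik: clip a gradient so that its
## squared Euclidean norm does not exceed 'clip'.
clipGradient <- function(grad, clip = NULL) {
  if (is.null(clip)) {
    return(grad)                        # NULL means no clipping
  }
  sqNorm <- sum(grad^2)
  if (sqNorm > clip) {
    grad <- grad * sqrt(clip / sqNorm)  # rescale, keep the direction
  }
  grad
}

g <- c(3, 4)                            # norm 5, squared norm 25
clipped <- clipGradient(g, clip = 4)    # rescaled to norm sqrt(4) = 2
unclipped <- clipGradient(g)            # returned unchanged
```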

Options for SGA:

SGA_momentum: 0, numeric momentum parameter for SGA. Must lie in interval [0,1].

Options for Adam:

Adam_momentum1: 0.9, numeric in [0,1], the first moment momentum
Adam_momentum2: 0.999, numeric in [0,1], the second moment momentum
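
To show how two momentum parameters of this kind typically enter the update, here is a sketch of one step of the textbook Adam rule, written for maximization. This is an assumption about the general technique, not a description of maxLik's internals; adamStep and the small constant eps are hypothetical names introduced for this example.

```r
## Hypothetical helper, not part of maxLik: one step of the textbook
## Adam update for gradient ascent. 'eps' is an assumed small constant
## that avoids division by zero.
adamStep <- function(theta, grad, m, v, t,
                     learningRate = 0.1,
                     momentum1 = 0.9, momentum2 = 0.999,
                     eps = 1e-8) {
  m <- momentum1 * m + (1 - momentum1) * grad     # first moment estimate
  v <- momentum2 * v + (1 - momentum2) * grad^2   # second moment estimate
  mHat <- m / (1 - momentum1^t)                   # bias corrections
  vHat <- v / (1 - momentum2^t)
  theta <- theta + learningRate * mHat / (sqrt(vHat) + eps)
  list(theta = theta, m = m, v = v)
}

## First step for the objective -(theta - 1)^2, starting from theta = 0
grad0 <- -2 * (0 - 1)                   # gradient at theta = 0
s1 <- adamStep(theta = 0, grad = grad0, m = 0, v = 0, t = 1)
```

With zero initial moments, the bias correction makes the first step approximately learningRate in size, in the direction of the gradient's sign.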

General options:

iterlim: 150, stopping condition (the default differs for different methods). Stop if more than iterlim iterations are performed. Note that ‘iteration’ may mean different things for different optimizers.
max.rows: 20, maximum number of matrix rows to be printed when requesting verbosity in the optimizers.
max.cols: 7, maximum number of columns to be printed. This also applies to vectors that are printed horizontally.
printLevel: 0, the level of verbosity. Larger values print more information. The result depends on the optimizer. The form print.level is also accepted by the methods for compatibility.
storeParameters: FALSE, whether to store and return the parameter values at each epoch. If TRUE, the stored values can be retrieved with storedParameters-method. The parameters are stored as a matrix with rows corresponding to the epochs and columns to the parameter components.
storeValues: FALSE, whether to store and return the objective function values at each epoch. If TRUE, the stored values can be retrieved with storedValues-method.

Methods

maxControl: (...) creates a “MaxControl” object. The arguments must be in the form option1 = value1, option2 = value2, .... The options should be slot names, but the method also supports selected other parameter forms for compatibility reasons, e.g. “print.level” instead of “printLevel”. If more than one option has the same name, the last one overwrites the previous values. This allows the user to override default parameters in the control list. See the example in maxLik-package.
maxControl: (x = "MaxControl", ...) overwrites parameters of an existing “MaxControl” object. The ‘...’ argument must be in the form option1 = value1, option2 = value2, .... If more than one option has the same name, only the last one is taken into account. This allows the user to override default parameters in the control list. See the example in maxLik-package.
maxControl: (x = "maxim") extracts the “MaxControl” structure from an estimated model.
show: shows the parameter values
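
A minimal sketch of the first two methods, assuming the maxLik package is installed. It relies only on the behaviour described above (creation from named options, the “print.level” compatibility form, and overwriting options in an existing object); reading the values with slot() assumes the slots carry the names listed in the Slots section.

```r
library(maxLik)

## Create a control object; "print.level" is the compatibility form
## of the slot name "printLevel"
co <- maxControl(tol = 1e-4, print.level = 2)

## Overwrite one option in the existing object, keep the others
co2 <- maxControl(co, printLevel = 0)

slot(co2, "tol")          # carried over from 'co'
slot(co2, "printLevel")   # the overwritten value
co2                       # the 'show' method prints the options
```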

Details

Typically, the control options are supplied in the form of a list, in which case the corresponding default values are overwritten by the user-specified ones. However, one may also create the control structure with maxControl(opt1=value1, opt2=value2, ...) and supply this object directly to the optimizer. In this case the optimization routine takes all the values from the control object.
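
The two ways of supplying options can be compared with a short sketch, assuming the maxLik package is installed. The objective f is a hypothetical toy function chosen for this example, and the option values are arbitrary.

```r
library(maxLik)

## Hypothetical objective: concave quadratic with its maximum at (1, 2)
f <- function(theta) -sum((theta - c(1, 2))^2)

## 1. Options as a list: these overwrite the corresponding defaults
resList <- maxNR(f, start = c(0, 0),
                 control = list(gradtol = 1e-8, iterlim = 50))

## 2. Options as a "MaxControl" object created beforehand
co <- maxControl(gradtol = 1e-8, iterlim = 50)
resObj <- maxNR(f, start = c(0, 0), control = co)

coef(resList)
coef(resObj)
```

Both calls use the same option values, so the two estimates should agree.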

Note

Several control parameters can also be supplied directly to the optimization routines.

Author(s)

Ott Toomet

References

Nelder, J. A. & Mead, R. (1965) A Simplex Method for Function Minimization. The Computer Journal 7, 308–313.
Marquardt, D. W. (1963) An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics 11, 431–441.

Examples

library(maxLik)
## Create a 'maxControl' object:
maxControl(tol=1e-4, sann_tmax=7, printLevel=2)

## Maximize the negative quadratic form -t(D) %*% W %*% D with a p.d.
## weight matrix W (no constraints are imposed in this example)
quadForm <- function(D) {
   return(-t(D) %*% W %*% D)
}
set.seed(1)                             # make the random W reproducible
eps <- 0.1
W <- diag(3) + matrix(runif(9), 3, 3)*eps
D <- rep(1/3, 3)                        # initial values
## create control object and use it for optimization
co <- maxControl(printLevel=2, qac="marquardt", marquardt_lambda0=1)
res <- maxNR(quadForm, start=D, control=co)
print(summary(res))
## Now perform the same with no trace information
co <- maxControl(co, printLevel=0)
res <- maxNR(quadForm, start=D, control=co) # no tracing information
print(summary(res))  # should be the same as above
maxControl(res) # shows the control structure

maxLik documentation built on May 29, 2024, 2:32 a.m.