rmarkdown::render("optimCheck-quicktut.Rmd") # R code to render vignette
\newcommand{\xx}{\boldsymbol{x}} \newcommand{\hxx}{\hat{\boldsymbol{x}}} \newcommand{\txx}{\tilde{\boldsymbol{x}}} \newcommand{\bb}{\boldsymbol{b}} \renewcommand{\AA}{\boldsymbol{A}}
The optimCheck
package provides a set of tools to check that output of an optimization algorithm is indeed at a local mode of the objective function. The tools include both visual and numerical checks, the latter serving to automate formalized unit tests with e.g., the R packages testthat
or RUnit
.
A brief overview of the package functionality is illustrated with the following example. Let
$$
Q(\xx) = \xx'\AA\xx - 2 \bb'\xx
$$
denote a quadratic objective function in $\xx \in \mathbb R^d$. If $\AA_{d \times d}$ is a positive-definite matrix, then the unique minimum of $Q(\xx)$ is $\hxx = \AA^{-1}\bb$. Let us now ignore this information and try to minimize $Q(\xx)$ using R's simplest built-in mode-finding routine, provided by the R function optim
.
In its simplest configuration, optim
requires only the objective function and a starting value $\xx_0$ to initialize the mode-finding procedure. Let's consider a difficult setting for optim
, with a relatively large $d = 12$ and a starting value $\xx_0$ which is far from the optimal value $\hxx$.
set.seed(2608) # set seed to fix output d <- 12 # dimension of optimization problem # create the objective function: Q(x) = x'Ax - 2b'x A <- crossprod(matrix(rnorm(d^2), d, d)) # positive definite matrix b <- rnorm(d) objfun <- function(x) crossprod(x, A %*% x)[1] - 2 * crossprod(b, x)[1] xhat <- solve(A, b) # analytic solution # numerical mode-finding using optim xfit <- optim(fn = objfun, # objective function par = xhat * 5, # initial value is far from the solution control = list(maxit = 1e5)) # very large max. number of iterations
optim_proj
Like most solvers, optim
utilizes various criteria to determine whether its algorithm has converged, which can be assess with the following command:
# any value other than 0 means optim failed to converge xfit$convergence
Here optim
reports that its algorithm has converged. Now let's check this visually with optimCheck
using projection plots. That is, let $\txx$ denote the potential optimum of $Q(\xx)$. Then for each $i = 1,\ldots,d$, we plot
$$
Q_i(x_i) = Q(x_i, \txx_{-i}), \qquad \txx_{-i} = (\tilde x_1, \ldots, \tilde x_{i-1}, \tilde x_{i+1}, \ldots, \tilde x_d).
$$
In other words, projection plot $i$ varies only $x_i$, while holding all other elements of $\xx$ fixed at the value of the potential solution $\txx$. These plots are produced with the optimCheck
function optim_proj
:
require(optimCheck) # load package # projection plots xnames <- parse(text = paste0("x[", 1:d, "]")) # variable names oproj <- optim_proj(fun = objfun, # objective function xsol = xfit$par, # potential solution maximize = FALSE, # indicates that a local minimum is sought xrng = .5, # range of projection plot: x_i +/- .5*|x_i| xnames = xnames)
In each of the projection plots, the potential solution $\tilde x_i$ is plotted in red. The xrng
argument to optim_proj
specifies the plotting range. Among various ways of doing this, perhaps the simplest is a single scalar value indicating that each plot should span $x_i \pm$ xrng
$\cdot |x_i|$. Thus we can see from these plots that optim
was sometimes up to 10% away from the local mode of the projection plots.
Projection plots are a powerful method of assessing the convergence of mode-finding routines to a local mode. While great for interactive testing, plots are not well-suited to automated unit testing as part of an R package development process. To this end, optimCheck
provides a means of quantifying the result of a call to optim_proj
. Indeed, a call to optim_proj
returns an object of class optproj
with the following elements:
sapply(oproj, function(x) dim(as.matrix(x)))
As described in the function documentation, xproj
and yproj
are matrices of which each column contains the $x$-axis and $y$-axis coordinates of the points contained in each projection plot. The summary
method for optproj
objects coverts these to absolute and relative errors in both the potential solution and the objective function. The print
method conveniently displays these results:
oproj # same print method as summary(oproj)
The documentation for summary.optproj
describes the various calculations it provides. Perhaps the most useful of these are the elementwise absolute and relative differences between the potential solution $\tilde{\xx}$ and $\hxx_\mathrm{proj}$, the vector of optimal 1D solutions in each projection plot. For convenience, these can be extracted with the diff
method:
diff(oproj) # equivalent to summary(oproj)$xdiff # here's exactly what these are xsol <- summary(oproj)$xsol # candidate solution xopt <- summary(oproj)$xopt # optimal solution in each projection plot xdiff <- cbind(abs = xopt-xsol, rel = (xopt-xsol)/abs(xsol)) range(xdiff - diff(oproj))
Thus it is proposed that a combination of summary
and diff
methods for projection plots can be useful for constructing automated unit tests. In this case, plotting itself can be disabled by passing optim_proj
the argument plot = FALSE
. See the optimCheck/tests
folder for testthat
examples featuring:
glm
function).rq
function in quantreg
)emEEE
in mclust
).set.seed(151) # run tests and output timings tresults <- testthat::test_package("optimCheck", reporter = "list") invisible(sapply(tresults, function(tres) { message("Context: ", tres$context, " [", length(tres$results), " tests -- ", round(tres$real, 1), " s]") }))
optim_refit
: A Numerical Alternative to Projection PlotsThere are some situations in which numerical quantification of projection plots leaves to be desired:
Generating all projection plots requires N = 2 * npts * length(xsol)
evaluations of the objective function (where the default value is npts = 100
), which can belabor the process of automated unit testing. A different test for mode-finding routines is to recalculate the optimal solution with an "very good" starting point: the current potential solution. This is the so-called "refine optizimation" -- or refit
-- strategy.
The optim_refit
function refines the optimization with a call to R's built-in general-purpose optimizer: the function optim
. In particular, it selects the default Nelder-Mead simplex method with a simplified parameter interface. As seen in the unit tests above, the refit
checks are 2-3 times faster than their projection plot counterparts. Consider now the example of refining the original optim
solution to the quadratic objective function:
orefit <- optim_refit(fun = objfun, # objective function xsol = xfit$par, # potential solution maximize = FALSE) # indicates that a local minimum is sought summary(orefit) # same print method as orefit
Thus we can see that the first and second run of optim
are quite different.
Of course, this does not mean that the refit solution produced by optim
is a local mode:
# projection plots with refined solution optim_proj(xsol = orefit$xopt, fun = objfun, xrng = .5, maximize = FALSE)
Indeed, the default optim
method is only accurate when initialized close to the optimal solution. Therefore, one may wish to run the refit test with a different optimizer. This can be done externally to optim_refit
, prior to passing the refit solution to the function via its argument xopt
. This is illustrated below using optim
's gradient-based quasi-Newton method:
# gradient of the objective function objgrad <- function(x) 2 * drop(A %*% x - b) # mode-finding using quasi-Newton method xfit2 <- optim(fn = objfun, # objective function gr = objgrad, # gradient par = xfit$par, # initial value (first optim fit) method = "BFGS") # external refit test with optimizer of choice orefit2 <- optim_refit(fun = objfun, xsol = xfit$par, # initial value (first optim fit) xopt = xfit2$par, # refit value (2nd fit with quasi-Newton method maximize = FALSE) # project plot test on refit solution optim_proj(xsol = orefit2$xopt, fun = objfun, xrng = .5, maximize = FALSE, plot = FALSE)
Many constrained statistical optimization problems, seek a "sparse" solution, i.e., one for which some of the elements of the optimal solution are equal to zero. In such cases, the relative difference between potential and optimal solution is an unreliable metric. A working proposal is to flag these "true zeros" in optim_proj
and optim_refit
, so as to add a 1 to the relative difference denominators. Other suggestions on this and optimCheck
in general are welcome.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.