numvbderiv_parallel: Economy numerical derivatives

numvbderiv_parallelR Documentation

Economy numerical derivatives

Description

numvbderiv does simple two-point symmetric numerical differentiation of any function WRTO a vector parameter, via (f(x+delta)-f(x-delta))/(2*delta). Your function can return a vector/matrix/array of real or complex type, and if the x parameter is not scalar, then the result has one extra dimension at the end for the per-parameter-element derivatives.

For multi-parameter (length(x)>=4 or so) derivatives of slow functions, you can speed things up a lot with parallel processing, by setting PARALLEL=TRUE or (better) by directly calling numvbderiv_parallel. But, be aware there is substantial learning-curve-pain-cost to all this parallel shenanigans in R. numvbderiv_parallel uses the foreach package to diff wrto each component x[i] in parallel, using however many cores at a time you tell it to. You have to set up a "parallel cluster" beforehand in R. See Examples— it took me a long time to get this working, but now it's good.

numvbderiv is definitely "economy model" and for many many years I have kept it out of mvbutils, because it is not particularly accurate nor incredibly robust, and I didn't want to have to deal with people's questions! But I use numvbderiv and numvbderiv_parallel all the time in code that I want to share (sometimes with different names, omitting the "mv"), and in 2024 it just became too annoying to have to distribute them separately. So here they are, with nice new names, and tarted-up documentation that you are now enjoying, but still warts and all.

Faq

  • Q: Surely there are well-known methods to produce more accurate and robust numerical derivatives?

  • A: Yep.

  • Q: I want something more!

  • A: Then use something else!

  • Q: Oh well. But I guess numvbderiv is easy to use, right?

  • A: Yep.

As to accuracy: IME numvbderiv is usually fine, and computationally cheap! The relevant parameter is eps; to compute Df(x)/dx|x=x0 your function f is evaluated at x0+/-eps*x0 (unless x0==0 exactly, in which case it is at +/-eps). The bigger you go with eps, the less mathematically accurate the result, since the neglected higher-order terms are bigger; but if you go too small, then the answer becomes computationally inaccurate because of rounding etc. The default is crude but has usually worked OK for me, given this is not a high-accuracy routine. I sometimes play around with values between 1e-3 and 1e-7. If you're worried, try two different values that differ by an order-of-mag.

Unlike eg the numDeriv package or pracma::numderiv, which use more function evaluations at various step-sizes to account for higher-order terms in the finite-difference approximation, numvbderiv does not try to be very accurate, and you do have to specify the step yourself (see Arguments) or trust the default. Nevertheless, I expect my numvbderiv to be more accurate than the original (or still-current default) of stats::numericDeriv because the latter appears not to do symmetric calculation, based on the code in "Writing R Extensions" section 5.11. IE, it just does (f(x0+e)-f(x0))/e. [Update in 2024: AFAIK stats::numericDeriv used never to even have a symmetric option, but it now appears to have added one now via its central argument— although that defaults to FALSE :/ .] Also, stats::numericDeriv is pretty horrible to use TBH; AFAIK its main historical purpose was just to show how to interface C to R, not to actually differentiate stuff!

Usage

numvbderiv( f, x0, eps=0.0001, param.name=NULL, ...,
  SIMPLIFY=TRUE, PARALLEL=FALSE,
  TWICE.TO.GET.SENSIBLE.ANSWERS=TRUE)
numvbderiv_parallel(f, x0, eps = 0.0001, param.name = NULL, ...,
    SIMPLIFY = TRUE,  PARALLEL = TRUE,
    PROGRESS_BAR = interactive() && .Platform$OS.type!='unix',
    PROGRESS_BAR_FILE = "",  FOREACH_ARGS = list())

Arguments

f

function of one or more arguments

x0

value to numdiff around

eps

Relative step-size. Evaluation is at x0+/-eps*x0 unless x0==0 exactly, in which case it is at +/-eps. See Faq.

param.name

Unless the parameter you want to diff WRTO comes first in the argument-list of f, you need to specify its name, eg param.name="c" if your function is f(a,b,c) and you wanna diff wirto the third one.

...

Other args that your f wants.

SIMPLIFY

If TRUE and f appears to return a "scalar-equivalent" result (eg all-but-one of its dimensions are of extent 1, as you can sometimes get eg from a matrix-multiply I guess if you use R's built-in routine), then this will turn the result into a pure vector. Avoids you getting tedious N*1 or 1*N "matrix" results that you then have to c() yourself.

PARALLEL

if FALSE, use the scalar version. If TRUE and length(x0)>1 and the foreach package is available and there is a "currently registered doPar backend" [sic], then parallel woop-woop magic will be used. numvbderiv/numvbderiv_parallel have defaults PARALLEL=FALSE/TRUE respectively.

FOREACH_ARGS

things to pass to foreach::foreach, eg .packages or perhaps .exports so your function can find stuff it needs when it is invoked in a new cold lonely R session.

PROGRESS_BAR

If you are bothering to use the parallel version, then presumably things are fairly slow; you can set PROGRESS_BAR=TRUE to see how it's going. I don't know if it works on Linux, coz it relies on flush.console, so the default there is FALSE, but you can give it a try.)

PROGRESS_BAR_FILE

I use numvbderiv_parallel during interactive R sessions in RGui, and the default of appearing in the console seems ideal. For other uses, you might need to tell the progress bar to appear somewhere else, via this argument which is passed as the file argument of txtProgressBar.

TWICE.TO.GET.SENSIBLE.ANSWERS

Leave it alone!!! Not for you.

Details

The progress bar

The progress bar (parallel case only) uses a txtProgressBar and some excellent Github code from K Vasilopoulos. It's no good trying to get your own function to show its progress or call-count in the parallel case, because it will be executing in separate invisible R processes and messages don't get sent back, so this is the only convenient way. However, the nature of foreach means that this progress bar is only updated when a task finishes, and since all deriv-steps will take about the same time, you'll probably get the first 4 finishing all-at-once, so that progress will update in a very clunky fashion and if your parameter is of low dimension, the bar may not help. The numvbderiv_parallel code actually does try to update the progress-bar before the paralleling begins, immediately after the very first function call which is to f(x0) itself, so in principle you should "quickly" get some idea of how long it's all gonna take— but that update doesn't always seem to show up. Displaying the bar relies on a call to utils::flush.console (qv) so prolly doesn't work under Unix; maybe there's another way. Future versions of numvbderivParallel may let you supply your own progress-bar rather than forcing txtProgressBar. For now, be grateful for what you have been given.

Value

Normally, an array/matrix with same dimensions as f(x0) except for an extra one at the end, of length(x0). If SIMPLIFY=TRUE (see Arguments) and a pure vector "makes sense", then the dimensions will be stripped and you'll get a pure vector.

See Also

pracma::numderiv; the numDeriv package; stats::numericDeriv, the foreach package, the doParallel package

Examples

# Complex numbers are OK:
numvbderiv( function( x) x*x, complex( real=1, imaginary=3))
# [1] 2+6i
# Parallel example...  the whole point is to show speed and generality
# Works fine on my machine
# But if testing under CRAN, which I normally never do,
# then CRAN's ludicrous 2-core limit, and deliberate inability to
# check CRANality (or even number of cores _allowed_) while running,
# makes this completely ridiculous
# Not for the first time
# I have used the function 'get_ncores_CRANal' to try to get round this...
if( require( 'doParallel')){ # auto loads foreach, iterators, parallel thx2 "Depends"
  ncores <- detectCores( logical=FALSE)
  scatn( '%i cores really found', ncores)
  if( ncores > 2 ){ # pointless otherwise
    # Need a slowish example. 1e5 is too small; 1e7 better,
    # ... but hard on auto builders eg R-universe
    BIGGOVAL <- 1e5
    slowfun <- function( pars, BIGGO)
      sum( sqr( 1+1/outer( seq_len( BIGGO), pars)))
    parstart <- rep( 2, 8)
    system.time(
      dscalar <- numvbderiv( slowfun, parstart,
          BIGGO=BIGGOVAL # named extra param (part of ...)
        )
    ) # scalar
    # Make "doPar back end". I do not know what I am doing ...
    # NB I like to leave some cores spare, hence "-1"--
    # superstition, really
    ncores_target <- min( ncores-1, length( parstart))
    # Anti CRANky: ignore on your own machine:
    # ncores_target should just work
    ncores_avail <- get_ncores_CRANal( ncores_target)
    scatn( 'Using %i cores eg cozza CRAN', ncores_avail)
    CLUSTO <- makeCluster( ncores_avail)
    registerDoParallel( CLUSTO, ncores_avail)
    # Next bit ensures slaves can find packages... sigh.
    # Necessary _here_ coz example, but you may not need it
    # clusterCall does not work properly :/, so the "obvious" fails:
    # clusterCall( CLUSTO, .libPaths, .libPaths())
    # Instead, we are forced into this nonsense:
    print( # for debuggery with as-CRAN
    eval( substitute(
        clusterEvalQ( CLUSTO, .libPaths( lb)),
        list( lb=.libPaths())))
    )
    # Need 'mvbutils::sqr', hence '.packages' arg
    scatn( 'Starting parallel time test')
    print( system.time(
      dpara <- numvbderiv_parallel( slowfun, parstart,
          BIGGO=BIGGOVAL, # named extra parameter
          FOREACH_ARGS=list( .packages= 'mvbutils')
        )
      )
    )
    scatn( 'Done')
    print( rbind( dscalar, dpara))
    # To refer to other data (ie beside params)
    # best practice is to put it into function's environment
    # (generally true, not just for numvbderiv)
    e <- new.env()
    e$paroffset <- c( 6, -3)
    fun2 <- function( pars) { # not a speed test, can be smaller
        sum( sqr( 1+1/outer( 1:1e3, pars+paroffset)))
      }
    environment( fun2) <- e
    scatn( 'Scalar, using extra data via environment')
    print( numvbderiv( fun2, parstart))
    # Parallel version should still work, coz function's environment
    # is also passed to slaves
    scatn( 'Trying parallel version...')
    print( try({
      numvbderiv_parallel( fun2, parstart,
          FOREACH_ARGS=list( .packages= 'mvbutils')
        )
      })
    )
    # Sometimes you do need to explicitly export stuff to the slave processes
    # Here's a version that will get paroffset from datenv
    # datenv must exist...
    alt_fun2 <- function( pars){
      environment( fun2) <- list2env( datenv)
      fun2( pars)
    }
    scatn( 'With explicit data (in parallel)')
    datenv <- as.list( e)
    print( numvbderiv_parallel( alt_fun2, parstart,
        FOREACH_ARGS=list(
          .packages= 'mvbutils',
          .export= cq( datenv, fun2) # stuff that alt_fun2 refers to
          )
      )
    )
    # Always tidy up your clusters once you have finished playing
    stopImplicitCluster()
    stopCluster( CLUSTO)
    rm( CLUSTO)
  } # if ncores>2
} # parallel

mvbutils documentation built on May 25, 2026, 5:09 p.m.