knitr::opts_chunk$set(echo = TRUE)
Whatever system you use, you need to install Julia. Go to https://julialang.org/downloads, choose the correct download, and install as appropriate. Note that autodiffr mainly works with Julia v0.6.x, and currently only partly works with Julia v0.7 (as Julia v0.7 is in beta).
After you have a working Julia installation, autodiffr can be installed by running:
devtools::install_github("Non-Contradiction/autodiffr")
It is also recommended to install the latest version of JuliaCall by running:
devtools::install_github("Non-Contradiction/JuliaCall")
After the installation is finished, run the following to load and set up autodiffr. Note that the first run will take some time, but subsequent runs should be faster.
library(autodiffr)
ad_setup()
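If you want to confirm which Julia installation is being used, a quick check is possible through JuliaCall (a minimal sketch, assuming JuliaCall can locate Julia automatically; this is not part of the original setup steps):

library(JuliaCall)
julia_setup()                  ## locates and initializes Julia
julia_eval("string(VERSION)")  ## e.g. "0.6.4" for a Julia v0.6.x installation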
We can use the package numDeriv to compare results with autodiffr.
require(numDeriv)
ForwardDiff and ReverseDiff
In this section, we use a function similar to the Rosenbrock function as an initial example:
## Later on, I may export these functions, maybe in an R environment including
## all testing functions, so users can get easy access to them for examples,
## testing and benchmarking purposes.
## But currently these functions are unexported in autodiffr.
rosen1 <- autodiffr:::rosenbrock_1
## There are other functions which can be used to replace rosen1
## in the following examples, like:
# rosen2 <- autodiffr:::rosenbrock_2
# rosen4 <- autodiffr:::rosenbrock_4
# ackley <- autodiffr:::ackley
rosen1
# rosen2
# rosen4
# ackley
target <- rosen1
These functions take a vector input and generate a scalar output, so forward_grad and reverse_grad can be used to calculate their gradients, and forward_hessian and reverse_hessian can be used to calculate their Hessians. Besides, it should be possible to calculate the Jacobian of the gradient function, which should be the same as the Hessian.
The basic usage is as follows:
X <- runif(5)
X
forward_grad(target, X)
reverse_grad(target, X)
forward_jacobian(function(x) forward_grad(target, x), X)
reverse_jacobian(function(x) reverse_grad(target, x), X)
forward_hessian(target, X)
reverse_hessian(target, X)
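To verify the claim above, we can check that the Jacobian of the gradient agrees with the Hessian (a quick sanity check using base R's all.equal):

all.equal(forward_hessian(target, X),
          forward_jacobian(function(x) forward_grad(target, x), X))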
And we can compare the results with numDeriv:
numDeriv::grad(target, X)
numDeriv::jacobian(function(x) numDeriv::grad(target, x), X)
numDeriv::hessian(target, X)
We can see that the results are similar, and the results from autodiffr should be more accurate, since automatic differentiation evaluates derivatives exactly (up to floating-point rounding) while numDeriv relies on finite-difference approximations.
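One way to see this is to look at the size of the discrepancy between the two gradients (illustrative only; the exact magnitude will vary with X):

max(abs(numDeriv::grad(target, X) - forward_grad(target, X)))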
makeGradFunc wraps the target in a gradient function, which can then be passed to gradient-based optimizers such as optimr from the optimx package:

grosen <- autodiffr::makeGradFunc(target)
grosen
grosen(X)

## use in optimization
library(optimx)
x0 <- c(-1.2, 1)
target(x0)
grosen(x0)
srosen0 <- optimr(x0, target, grosen, method = "Rvmmin", control = list(trace = 1))
srosen0
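The gradient function can also be used with base R optimizers; for example, nlm() accepts an analytic gradient supplied as an attribute of the objective's return value (a minimal sketch, not from the original vignette):

## wrap the target so nlm() can use the AD gradient
frosen <- function(x) {
    val <- target(x)
    attr(val, "gradient") <- grosen(x)
    val
}
nlm(frosen, x0)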
If the gradient or Hessian of the same target function needs to be calculated repeatedly on inputs of the same shape (for example, input vectors of the same length, or input matrices with the same numbers of rows and columns), users can create config objects beforehand to improve performance:
cfg <- forward_grad_config(target, X)
forward_grad(target, X, cfg = cfg)

cfg <- reverse_grad_config(X)
reverse_grad(target, X, cfg = cfg)

cfg <- forward_hessian_config(target, X)
forward_hessian(target, X, cfg = cfg)

cfg <- reverse_hessian_config(X)
reverse_hessian(target, X, cfg = cfg)
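To see the effect, one can time repeated gradient evaluations with and without a precomputed config object (a rough sketch using system.time; the exact timings depend on your machine):

cfg <- forward_grad_config(target, X)
system.time(for (i in 1:100) forward_grad(target, X))
system.time(for (i in 1:100) forward_grad(target, X, cfg = cfg))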
Users can also choose a chunk size when creating config objects for forward-mode automatic differentiation; choosing an appropriate chunk size can improve performance further.
cfg1 <- forward_grad_config(target, X, chunk_size = 1)
forward_grad(target, X, cfg = cfg1)

cfg1 <- forward_hessian_config(target, X, chunk_size = 1)
forward_hessian(target, X, cfg = cfg1)
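The effect of the chunk size is easier to see on longer inputs; a hedged sketch, assuming the target accepts vectors of arbitrary length (timings illustrative only):

Y <- runif(100)
cfg_small <- forward_grad_config(target, Y, chunk_size = 1)
cfg_large <- forward_grad_config(target, Y, chunk_size = 10)
system.time(for (i in 1:10) forward_grad(target, Y, cfg = cfg_small))
system.time(for (i in 1:10) forward_grad(target, Y, cfg = cfg_large))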
In reverse-mode automatic differentiation, if the target function is defined without any branching (such as if, ifelse, or while), then it is possible to use a tape to further improve performance:
tp <- reverse_grad_tape(target, X)
reverse_grad(tp, X)

tp <- reverse_hessian_tape(target, X)
reverse_hessian(tp, X)
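Again, a rough timing comparison can show the benefit of reusing a prerecorded tape (illustrative sketch; timings vary by machine):

tp <- reverse_grad_tape(target, X)
system.time(for (i in 1:100) reverse_grad(target, X))
system.time(for (i in 1:100) reverse_grad(tp, X))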
User interface functions of autodiffr

autodiffr provides a set of user interface functions to deal with the usage of forward- and reverse-mode automatic differentiation.
If users want to calculate the gradient, Hessian, or Jacobian directly:
autodiffr::ad_grad(target, X, mode = "forward")
autodiffr::ad_grad(target, X, mode = "reverse")
autodiffr::ad_hessian(target, X, mode = "forward")
autodiffr::ad_hessian(target, X, mode = "reverse")
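For the Jacobian, a vector-valued target is needed; a minimal sketch, assuming ad_jacobian follows the same pattern as ad_grad and ad_hessian, with fvec a made-up example function:

fvec <- function(x) c(sum(x^2), prod(x))  ## hypothetical vector-valued target
autodiffr::ad_jacobian(fvec, X, mode = "forward")
autodiffr::ad_jacobian(fvec, X, mode = "reverse")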
It is also easy to create a gradient, Hessian, or Jacobian function:
autodiffr::makeGradFunc(target, mode = "forward")(X)
autodiffr::makeGradFunc(target, mode = "reverse")(X)
autodiffr::makeHessianFunc(target, mode = "forward")(X)
autodiffr::makeHessianFunc(target, mode = "reverse")(X)
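A corresponding Jacobian function can presumably be created in the same way (a sketch assuming makeJacobianFunc mirrors makeGradFunc and makeHessianFunc, reusing the hypothetical fvec from above):

autodiffr::makeJacobianFunc(fvec, mode = "forward")(X)
autodiffr::makeJacobianFunc(fvec, mode = "reverse")(X)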
If users want a gradient, Hessian, or Jacobian function specialized for inputs of a fixed shape (such as a fixed length of the input vector, or fixed numbers of rows and columns of the input matrix), then it is possible to obtain an optimized function by providing the x argument:
g1 <- autodiffr::makeGradFunc(target, mode = "forward", x = runif(length(X)))
g1(X)
g2 <- autodiffr::makeGradFunc(target, mode = "reverse", x = runif(length(X)))
g2(X)
h1 <- autodiffr::makeHessianFunc(target, mode = "forward", x = runif(length(X)))
h1(X)
h2 <- autodiffr::makeHessianFunc(target, mode = "reverse", x = runif(length(X)))
h2(X)
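The specialized functions should be faster on repeated calls; a rough comparison against an unspecialized gradient function (timings illustrative only):

g0 <- autodiffr::makeGradFunc(target, mode = "forward")
system.time(for (i in 1:100) g0(X))
system.time(for (i in 1:100) g1(X))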
Beyond the x argument, users can also provide chunk_size for forward-mode automatic differentiation; see ?ForwardDiff for instructions on how to choose the chunk size. For reverse-mode automatic differentiation, users can set use_tape, which may greatly increase the performance of the gradient, Hessian, or Jacobian functions.
g1 <- autodiffr::makeGradFunc(target, mode = "forward", x = runif(length(X)), chunk_size = 1)
g1(X)
g2 <- autodiffr::makeGradFunc(target, mode = "reverse", x = runif(length(X)), use_tape = TRUE)
g2(X)
h1 <- autodiffr::makeHessianFunc(target, mode = "forward", x = runif(length(X)), chunk_size = 1)
h1(X)
h2 <- autodiffr::makeHessianFunc(target, mode = "reverse", x = runif(length(X)), use_tape = TRUE)
h2(X)
But please note that if the target function is defined with branching (such as if, ifelse, or while), then use_tape may give incorrect results, as in the following example: the tape records only the operations executed at the x where it was created, so the branch taken during recording is baked into the tape.
f <- function(x) {
    if (x[1] > 1.5) {
        sum(x^2)
    } else {
        sum(x^3)
    }
}
gF <- autodiffr::makeGradFunc(f, mode = "forward", x = rep(2, 3), chunk_size = 1)
gR <- autodiffr::makeGradFunc(f, mode = "reverse", x = rep(2, 3), use_tape = TRUE)
gF(rep(1, 3)) ## correct result
gR(rep(1, 3)) ## wrong result: the tape was recorded at x = rep(2, 3), so it always follows the sum(x^2) branch