The goal of armacmp is to create a DSL to formulate linear algebra code in R that is compiled to C++ using the Armadillo Template Library. It also offers mathematical optimization that uses RcppEnsmallen to optimize functions in C++.
The scope of the package is linear algebra and Armadillo. It is not meant to evolve into a general-purpose R-to-C++ transpiler.
It has three main functions:
compile compiles an R function to C++ and makes that function available again in your R session.
translate translates an R function to C++ and returns the code as text.
compile_optimization_problem uses RcppEnsmallen and the functions above to compile continuous mathematical optimization problems to C++.
This is currently an experimental prototype that almost certainly contains bugs or unexpected behaviour. However, I would be happy to receive any type of feedback, alpha testers, feature requests and potential use cases.
Potential use cases:
Estimating the Rcpp speedup gain for linear algebra code.
Using translate to generate C++ code as a starting point for further development.
Mathematical optimization with optimize.
remotes::install_github("dirkschumacher/armacmp")
You can compile R-like code to C++. Not all R functions are supported.
library(armacmp)
Takes a matrix and returns its transpose.
trans <- compile(function(X) {
  return(t(X))
})
trans(matrix(1:10))
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 2 3 4 5 6 7 8 9 10
Or a slightly larger example using QR decomposition:
# from Arnold, T., Kane, M., & Lewis, B. W. (2019). A Computational Approach to Statistical Learning. CRC Press.
lm_cpp <- compile(function(X, y = type_colvec()) {
  qr_res <- qr(X)
  qty <- t(qr.Q(qr_res)) %*% y
  beta_hat <- backsolve(qr.R(qr_res), qty)
  return(beta_hat, type = type_colvec())
})
# example from the R docs of lm.fit
n <- 70000
p <- 20
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
all.equal(
  as.numeric(coef(lm.fit(X, y))),
  as.numeric(lm_cpp(X, y))
)
#> [1] TRUE
armacmp always compiles functions. Every function needs to have a return statement with an optional type argument.
my_fun <- compile(function(X, y = type_colvec()) {
  return(X %*% y, type = type_colvec())
})
A lot of linear algebra functions/operators are defined, as well as some control flow (for loops and if/else). Please take a look at the function reference article for more details on what can be expressed.
ensmallen
The package now also supports optimization of functions using RcppEnsmallen. Find out more at ensmallen.org.
All code is compiled to C++. During the optimization there is no context switch back to R.
Here we minimize 2 * norm(x)^2 using simulated annealing.
# taken from the docs of ensmallen.org
optimize <- compile_optimization_problem(
  data = list(),
  evaluate = function(x) {
    return(2 * norm(x)^2)
  },
  optimizer = optimizer_SA()
)
# should be roughly 0
optimize(matrix(c(1, -1, 1), ncol = 1))
#> [,1]
#> [1,] 0.001071887
#> [2,] -0.001426598
#> [3,] 0.001272070
Optimizers:
optimizer_SA
optimizer_CNE
Here we solve a linear regression problem using L-BFGS.
optimize_lbfgs <- compile_optimization_problem(
  data = list(design_matrix = type_matrix(), response = type_colvec()),
  evaluate = function(beta) {
    return(norm(response - design_matrix %*% beta)^2)
  },
  gradient = function(beta) {
    return(-2 %*% t(design_matrix) %*% (response - design_matrix %*% beta))
  },
  optimizer = optimizer_L_BFGS()
)
# this example is taken from the RcppEnsmallen package
# https://github.com/coatless/rcppensmallen/blob/master/src/example-linear-regression-lbfgs.cpp
n <- 1e6
beta <- c(-2, 1.5, 3, 8.2, 6.6)
p <- length(beta)
X <- cbind(1, matrix(rnorm(n), ncol = p - 1))
y <- X %*% beta + rnorm(n / (p - 1))
# Run optimization with lbfgs fully in C++
optimize_lbfgs(
  design_matrix = X,
  response = y,
  beta = matrix(runif(p), ncol = 1)
)
#> [,1]
#> [1,] -1.999974
#> [2,] 1.502354
#> [3,] 3.002081
#> [4,] 8.199424
#> [5,] 6.597857
Optimizers:
optimizer_L_BFGS
optimizer_GradientDescent
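The gradient passed to the L-BFGS example above can be sanity-checked in plain R before compiling anything. The following is a minimal sketch, not part of armacmp: it assumes that norm in the compiled code is the Euclidean norm (as in Armadillo), so the objective is written here as an explicit sum of squares, and the helper numeric_gradient is a hypothetical name introduced only for this illustration. It compares the analytic gradient -2 * t(X) %*% (y - X %*% beta) against central finite differences.

```r
set.seed(1)
n <- 200
p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), ncol = p - 1))
y <- X %*% c(-2, 1.5, 3) + rnorm(n)

# Least-squares objective, written as an explicit sum of squares
# (assumption: this matches the Euclidean norm used on the C++ side).
objective <- function(beta) sum((y - X %*% beta)^2)

# Analytic gradient, same formula as in the L-BFGS example.
gradient <- function(beta) -2 * t(X) %*% (y - X %*% beta)

# Hypothetical helper: central finite-difference approximation of the gradient.
numeric_gradient <- function(f, beta, h = 1e-6) {
  sapply(seq_along(beta), function(i) {
    step <- replace(numeric(length(beta)), i, h)
    (f(beta + step) - f(beta - step)) / (2 * h)
  })
}

beta0 <- matrix(runif(p), ncol = 1)
analytic <- as.numeric(gradient(beta0))
approximate <- numeric_gradient(objective, beta0)

# Compare both gradients up to a small relative tolerance.
gradients_agree <- isTRUE(all.equal(analytic, approximate, tolerance = 1e-5))
print(gradients_agree)
```

If the two gradients disagree, the optimizer is likely to converge slowly or to the wrong point, so a check like this is cheap insurance before handing the functions to compile_optimization_problem.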
When does armacmp improve performance?
It really depends on the use-case and your code. In general, Armadillo can combine linear algebra operations. For example, the addition of 4 matrices A + B + C + D can be done in a single for loop. Armadillo can detect that and generate efficient code.
So whenever you combine many different operations, armacmp might be helpful in speeding things up.
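One way to see whether the A + B + C + D case pays off on a given machine is to time it. The sketch below is only an illustration under assumptions: the names add4_cpp and add4_r, the matrix size and the repetition count are all made up for this example, and no particular speedup is claimed, since copying data between R and C++ can offset the gain.

```r
library(armacmp)

# Compiled version: Armadillo may evaluate the whole sum in a single pass.
add4_cpp <- compile(function(A, B, C, D) {
  return(A + B + C + D)
})

# Plain R version for comparison; each `+` allocates an intermediate matrix.
add4_r <- function(A, B, C, D) {
  A + B + C + D
}

set.seed(42)
n <- 1000
A <- matrix(rnorm(n * n), n, n)
B <- matrix(rnorm(n * n), n, n)
C <- matrix(rnorm(n * n), n, n)
D <- matrix(rnorm(n * n), n, n)

# Both versions should return the same matrix.
results_match <- isTRUE(all.equal(add4_r(A, B, C, D), add4_cpp(A, B, C, D)))
print(results_match)

# Rough timing over a few repetitions; results vary by machine.
print(system.time(for (i in 1:20) add4_r(A, B, C, D)))
print(system.time(for (i in 1:20) add4_cpp(A, B, C, D)))
```

For more reliable numbers, a dedicated benchmarking package would be a better fit than system.time, but the comparison above is enough for a first impression.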
We gather some examples on the wiki to further explore if compiling linear algebra code to C++ actually makes sense for pure speed reasons.
armacmp is experimental and has a volatile codebase. The best way to contribute is to write issues/report bugs/propose features and test the package with your specific use-case.
Please note that the ‘armacmp’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.