\section{Introduction}

The GNU Scientific Library, or \pkg{GSL}, is a collection of numerical routines for scientific computing \citep{GSL}. It is a rigourously developed and tested library providing support for a wide range of scientific or numerical tasks. Among the topics covered in the \pkg{GSL} are complex numbers, roots of polynomials, special functions, vector and matrix data structures, permutations, combinations, sorting, BLAS support, linear algebra, fast fourier transforms, eigensystems, random numbers, quadrature, random distributions, quasi-random sequences, Monte Carlo integration, N-tuples, differential equations, simulated annealing, numerical differentiation, interpolation, series acceleration, Chebyshev approximations, root-finding, discrete Hankel transforms least-squares fitting, minimization, physical constants, basis splines and wavelets.

Support for \proglang{C} programming with the \pkg{GSL} is available as the \pkg{GSL} itself is written in \proglang{C}, and provides a \proglang{C}-language Application Programming Interface (API). Access from \proglang{C++} is possible, albeit not at an abstraction level that could be offered by dedicated \proglang{C++} implementations. Several \proglang{C++} wrappers for the \pkg{GSL} have been written over the years; none reached a state of completion comparable to the \pkg{GSL} itself.

The \pkg{GSL} combines broad coverage of scientific topics, serious implementation effort, and the use of the well-known GNU General Public License (GPL). This has lead to fairly wide usage of the library. As a concrete example, we can consider the Comprehensive R Archive Network (CRAN) repository network for the \proglang{R} language and environment \citep{R:Main}. CRAN contains over three dozen packages interfacing the \pkg{GSL}. Of these more than half interface the vector or matrix classes as shown in Table \ref{tab:useOfGSLatCRAN}. This provides empirical evidence indicating that the \pkg{GSL} is popular among programmers using either the \proglang{C} or \proglang{C++} language for solving problems applied science.

\begin{table} \centering \begin{small} \begin{tabular}{lccc} \toprule Package & Any \texttt{gsl} header & \texttt{gsl_vector.h} & \texttt{gsl_matrix.h} \ \midrule abn & $\star$ & $\star$ & $\star$ \
BayesLogit & $\star$ & & \ BayesSAE & $\star$ & $\star$ & $\star$ \
BayesVarSel & $\star$ & $\star$ & $\star$ \ BH & $\star$ & $\star$ & \ bnpmr & $\star$ & & \ BNSP & $\star$ & $\star$ & $\star$ \ cghseg & $\star$ & $\star$ & $\star$ \ cit & $\star$ & & \ diversitree & $\star$ & & $\star$ \
eco & $\star$ & & \ geoCount & $\star$ & & \ graphscan & $\star$ & & \ gsl & $\star$ & $\star$ & \ gstat & $\star$ & & \ hgm & $\star$ & & \ HiCseg & $\star$ & & $\star$ \ igraph & $\star$ & & \ KFKSDS & $\star$ & $\star$ & $\star$ \ libamtrack & $\star$ & & \ mixcat & $\star$ & $\star$ & $\star$ \ mvabund & $\star$ & $\star$ & $\star$ \ outbreaker & $\star$ & $\star$ & $\star$ \ R2GUESS & $\star$ & $\star$ & $\star$ \ RCA & $\star$ & & \
RcppGSL & $\star$ & $\star$ & $\star$ \ RcppSMC & $\star$ & & \ RcppZiggurat & $\star$ & & \
RDieHarder & $\star$ & $\star$ & $\star$ \ ridge & $\star$ & $\star$ & $\star$ \ Rlibeemd & $\star$ & $\star$ & \
Runuran & $\star$ & & \ SemiCompRisks & $\star$ & & $\star$ \ simplexreg & $\star$ & $\star$ & $\star$ \ stsm & $\star$ & $\star$ & $\star$ \ survSNP & $\star$ & & \ TKF & $\star$ & $\star$ & $\star$ \ topicmodels & $\star$ & $\star$ & $\star$ \ VBLPCM & $\star$ & $\star$ & \ VBmix & $\star$ & $\star$ & $\star$ \ \bottomrule \end{tabular} \end{small} \caption{CRAN Package Usage of \pkg{GSL} outright, for vectors and for matrices.} \label{tab:useOfGSLatCRAN}

\begin{flushleft} \footnotesize \textsl{Note:} Data gathered in late July 2015 by use of \texttt{grep} searching (recursively) for inclusion of any GSL header, or the vector and matrix headers specifically, within the \texttt{src/} or \texttt{inst/include/} directories of expanded source code archives of the CRAN network. Convenient (temporary) shell access to such an expanded code archive via WU Vienna is gratefully acknowledged. \end{flushleft} \end{table}

At the same time, the \pkg{Rcpp} package \citep{JSS:Rcpp,Eddelbuettel:2013:Rcpp,CRAN:Rcpp,TAS:Rcpp} offers a higher-level interface between \proglang{R} and \proglang{C++}. \pkg{Rcpp} permits \proglang{R} objects like vectors, matrices, lists, functions, environments, $\ldots$, to be manipulated directly at the \proglang{C++} level, and alleviates the needs for complicated and error-prone parameter passing and memory allocation. It also allows compact vectorised expressions similar to what can be written in \proglang{R} directly at the \proglang{C++} level.

The \pkg{RcppGSL} package discussed here aims to close the gap. It offers access to \pkg{GSL} functions, in particular via the vector and matrix data structures used throughout the \pkg{GSL}, while staying closer to the `whole object model' familar to the \proglang{R} programmer.

The rest of paper is organised as follows. The next section shows a motivating example of a fast linear model fit routine using \pkg{GSL} functions. The following section discusses support for \pkg{GSL} vector types, which is followed by a section on matrices. The following two section discusses error handling, and then use of \pkg{RcppGSL} in your own package. This is followed by short discussions of how to use \pkg{RcppGSL} with \pkg{inline} and \textsl{Rcpp Attributes}, respectively, before a short concluding summary.

\section{Motivation: fastLm}

Fitting linear models is a key building block of analysing and modeling data. \proglang{R} has a very complete and feature-rich function in \texttt{lm()} which provides a model fit as well as a number of diagnostic measure, either directly or via the \texttt{summary()} method for linear model fits. The \texttt{lm.fit()} function provides a faster alternative (which is however recommend only for for advanced users) which provides estimates only and fewer statistics for inference. This may lead to user requests for a routine which is both fast and featureful enough. The \texttt{fastLm} routine shown here provides such an implementation as part of the \pkg{RcppGSL} package. It uses the \pkg{GSL} for the least-squares fitting functions and provides a nice example for \pkg{GSL} integration with \proglang{R}.

#include <RcppGSL.h>

#include <gsl/gsl_multifit.h>
#include <cmath>

// declare a dependency on the RcppGSL package; 
// also activates plugin (but not needed when 
// 'LinkingTo: RcppGSL' is used with a package)
//
// [[Rcpp::depends(RcppGSL)]]

// tell Rcpp to turn this into a callable 
// function called 'fastLm'
// 
// [[Rcpp::export]]
Rcpp::List fastLm(const RcppGSL::Matrix & X, 
                  const RcppGSL::Vector & y) {

    // row and column dimension
    int n = X.nrow(), k = X.ncol();
    double chisq;
    // to hold the coefficient vector
    RcppGSL::Vector coef(k); 
    // and the covariance matrix
    RcppGSL::Matrix cov(k,k);   


    // the actual fit requires working memory 
    // which we allocate and then free
    gsl_multifit_linear_workspace *work = 
        gsl_multifit_linear_alloc (n, k);
    gsl_multifit_linear (X, y, coef, cov, 
                         &chisq, work);
    gsl_multifit_linear_free (work);

    // assign diagonal to a vector, then take 
    // square roots to get std.error
    Rcpp::NumericVector std_err;
    // need two step decl. and assignment
    std_err = gsl_matrix_diagonal(cov); 
    // sqrt() is an Rcpp sugar function
    std_err = Rcpp::sqrt(std_err); 

    return Rcpp::List::create(
        Rcpp::Named("coefficients") = coef,
        Rcpp::Named("stderr")       = std_err,
        Rcpp::Named("df.residual")  = n - k);
}

The function interface defines two \pkg{RcppGSL} variables: a matrix and a vector. Both use the standard numeric type \texttt{double} as discussed below. The \pkg{GSL} supports other types ranging from lower precision floating point to signed and unsigned integers as well as complex numbers. The vector and matrix classes are templated for use with all these \proglang{C} / \proglang{C++} types---though \proglang{R} uses only \texttt{double} and \texttt{int}. For these latter two, we offer a shorthand definition via a \texttt{typedef} which allows a shorter non-template use. Having extracted the row and column dimentions, we then reserve another vector and matrix to hold the resulting coefficient estimates as well as the estimate of the covariance matrix. Next, we allocate workspace using a \pkg{GSL} routine, fit the linear model and free the just-allocated workspace. The next step involves extracting the diagonal element from the covariance matrix, and taking the square root (using a vectorised function from \pkg{Rcpp}). Finally we create a named list with the return values.

In earlier version of the \pkg{RcppGSL} package, we also explicitly called \texttt{free()} to return temporary memory allocation to the operating system. This step had to be done as the underlying objects are managed as \proglang{C} objects. They conform to the \pkg{GSL} interface, and work without any automatic memory management. But as we provide a \proglang{C++} data structure for matrix and vector objects, we can manage them using \proglang{C++} facilities. In particular, the destructor can free the memory when the object goes out of scope. Explicit \texttt{free()} calls are still permitted as we keep track the object status so that memory cannot accidentally be released more than once. Another more recent addition permits use of \texttt{const \&} in the interface. This instructs the compiler that values of the corresponding variable will not be altered, and are passed into the function by reference rather than by value.

We note that \pkg{RcppArmadillo} \citep{CRAN:RcppArmadillo,Eddelbuettel+Sanderson:2013:RcppArmadillo} implements a matching \texttt{fastLm} function using the Armadillo library by \cite{Sanderson:2010:Armadillo}, and can do so with even more compact code due to \proglang{C++} features. Moreover, \pkg{RcppEigen} \citep{CRAN:RcppEigen,JSS:RcppEigen} provides a \texttt{fastLm} implementation with a comprehensive comparison of matrix decomposition methods.

Vectors

This section details the different vector represenations, starting with their definition inside the \pkg{GSL}. We then discuss our layering before showing how the two types map. A discussion of read-only `vector view' classes concludes the section.

\pkg{GSL} Vectors

\pkg{GSL} defines various vector types to manipulate one-dimensionnal data, similar to \proglang{R} arrays. For example the \verb|gsl_vector| and \verb|gsl_vector_int| structs are defined as:

typedef struct{
    size_t size;
    size_t stride;
    double * data;
    gsl_block * block;
    int owner;
} gsl_vector;

typedef struct {
    size_t size;
    size_t stride;
    int * data;
    gsl_block_int * block;
    int owner;
} gsl_vector_int;

A typical use of the \verb|gsl_vector| struct is given below:

int i;
// allocate a gsl_vector of size 3
gsl_vector *v = gsl_vector_alloc(3);    

// fill the vector
for (i = 0; i < 3; i++) {        
    gsl_vector_set(v, i, 1.23 + i);
}

// access elements
double sum = 0.0;  
for (i = 0; i < 3; i++) {
    sum += gsl_vector_set(v, i);
}

// free the memory
gsl_vector_free(v); 

Note that we have to explicitly free the allocated memory at the end. With \proglang{C}-style programming, this step is always the responsibility of the programmer.

RcppGSL::vector

\pkg{RcppGSL} defines the template \texttt{RcppGSL::vector} to manipulate \verb|gsl_vector| pointers taking advantage of C++ templates. Using this template type, the previous example now becomes:

int i;
// allocate a gsl_vector of size 3
RcppGSL::vector<double> v(3);

// fill the vector
for (i = 0; i < 3; i++) {
    v[i] = 1.23 + i;
}

// access elements
double sum = 0.0;        
for (i = 0; i < 3; i++) {
    sum += v[i];
}

// (optionally) free the memory
// also automatic when out of scope
v.free();

The class \texttt{RcppGSL::vector} is a smart pointer which can be deployed anywhere where a raw pointer \verb|gsl_vector| can be used, such as the \verb|gsl_vector_set| and \verb|gsl_vector_get| functions above.

Beyond the convenience of a nicer syntax for allocation (and of course the managed release of memory either via \texttt{free()} or when going out of scope), the \texttt{RcppGSL::vector} template faciliates interchange of \pkg{GSL} vectors with \pkg{Rcpp} objects, and hence \pkg{R} objects. The following example defines a \texttt{.Call} compatible function called \verb|sum_gsl_vector_int| that operates on a \verb|gsl_vector_int| through the \texttt{RcppGSL::vector} template specialization:

// [[Rcpp::export]]
int sum_gsl_vector_int(const 
                       RcppGSL::vector<int> & 
                       vec) {
    int res = std::accumulate(vec.begin(), 
                              vec.end(), 0);
    return res;
}

Here we no longer need to call \texttt{free()} explicitly as the \texttt{vec} allocation is returned automatically at the end of the function body when the variable goes out of scope.

Once the function has created via \texttt{sourceCpp()} or \texttt{cppFunction()} from \textsl{Rcpp Attributes} (see section \ref{sec:attributes} for more on this), it can then be called from \proglang{R} :

fx <- Rcpp::cppFunction("
int sum_gsl_vector_int(RcppGSL::vector<int> vec) {
    int res = std::accumulate(vec.begin(),
                              vec.end(), 0);
    return res;
}", depends="RcppGSL")
sum_gsl_vector_int(1:10)

A second example shows a simple function that grabs elements of an R list as \verb|gsl_vector| objects using implicit conversion mechanisms of \pkg{Rcpp}

// [[Rcpp::export]]
double gsl_vector_sum_2(const Rcpp::List & data) {
    // grab "x" as a gsl_vector through the 
    // RcppGSL::vector<double> class
    const RcppGSL::vector<double> x = data["x"];

    // grab "y" as a gsl_vector through the 
    // RcppGSL::vector<int> class
    const RcppGSL::vector<int> y = data["y"];
    double res = 0.0;
    for (size_t i=0; i< x->size; i++) {
       res += x[i] * y[i];
    }

    // return result, memory freed automatically
    return res;    
}

called from \proglang{R}:

Rcpp::cppFunction("
double gsl_vector_sum_2(Rcpp::List data) {
    RcppGSL::vector<double> x = data[\"x\"];
    RcppGSL::vector<int> y = data[\"y\"];
    double res = 0.0;
    for (size_t i=0; i< x->size; i++) {
        res += x[i] * y[i];
    }
    return res;
}", depends= "RcppGSL")
data <- list( x = seq(0,1,length=10), y = 1:10 )
gsl_vector_sum_2(data)

Mapping

Table \ref{tab:mappingVectors} shows the mapping between types defined by the \pkg{GSL} and their corresponding types in the \pkg{RcppGSL} package.

\begin{table}[htb] \centering \begin{small} \begin{tabular}{ll} \toprule GSL vector & RcppGSL \ \midrule \texttt{gsl_vector} & \texttt{RcppGSL::vector} as well as \texttt{RcppGSL::Vector} \ \texttt{gsl_vector_int} & \texttt{RcppGSL::vector} as well as \texttt{RcppGSL::IntVector} \ \texttt{gsl_vector_float} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_long} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_char} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_complex} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_complex_float} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_complex_long_double} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_long_double} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_short} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_uchar} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_uint} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_ushort} & \texttt{RcppGSL::vector} \ \texttt{gsl_vector_ulong} & \texttt{RcppGSL::vector} \ \bottomrule \end{tabular} \end{small} \caption{Correspondance between \pkg{GSL} vector types and templates defined in \pkg{RcppGSL}.} \label{tab:mappingVectors} \end{table}

As shown, we also define two convenient shortcuts for the very common case of \texttt{double} and \texttt{int} vectors. First, \texttt{RcppGSL::Vector} is a short-hand for the \texttt{RcppGSL::vector} template instantiation. Second, \texttt{RcppGSL::IntVector} does the same for integer-valued vectors. Other types still require explicit templates.

Vector Views

Several \pkg{GSL} algorithms return \pkg{GSL} vector views as their result type. \pkg{RcppGSL} defines the template class \texttt{RcppGSL::vector_view} to handle vector views using \proglang{C++} syntax.

// [[Rcpp::export]]
Rcpp::List test_gsl_vector_view() {
    int n = 10;
    RcppGSL::vector<double> v(n);
    for (int i=0 ; i<n; i++) {
        v[i] = i;
    }
    const RcppGSL::vector_view<double> v_even =
      gsl_vector_subvector_with_stride(v,0,2,n/2);
    const RcppGSL::vector_view<double> v_odd  =
      gsl_vector_subvector_with_stride(v,1,2,n/2);

    return Rcpp::List::create(
        Rcpp::Named("even") = v_even,
        Rcpp::Named("odd" ) = v_odd);
}

As with vectors, \proglang{C++} objects of type \texttt{RcppGSL::vector_view} can be converted implicitly to their associated \pkg{GSL} view type. Table \ref{tab:mappingVectorViews} displays the pairwise correspondance so that the \proglang{C++} objects can be passed to compatible \pkg{GSL} algorithms.

\begin{table}[htb] \centering \begin{small} \begin{tabular}{ll} \toprule gsl vector views & RcppGSL \ \midrule \texttt{gsl_vector_view} & \texttt{RcppGSL::vector_view}; \texttt{RcppGSL::VectorView} \ \texttt{gsl_vector_view_int} & \texttt{RcppGSL::vector_view}; \texttt{RcppGSL::IntVectorView} \ \texttt{gsl_vector_view_float} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_long} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_char} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_complex} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_complex_float} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_complex_long_double} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_long_double} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_short} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_uchar} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_uint} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_ushort} & \texttt{RcppGSL::vector_view} \ \texttt{gsl_vector_view_ulong} & \texttt{RcppGSL::vector_view} \ \bottomrule \end{tabular} \end{small} \caption{Correspondance between \pkg{GSL} vector view types and templates defined in \pkg{RcppGSL}.} \label{tab:mappingVectorViews} \end{table}

The vector view class also contains a conversion operator to automatically transform the data of the view object to a \pkg{GSL} vector object. This enables use of vector views where \pkg{GSL} would expect a vector. And as before, \texttt{double} and \texttt{int} types can be accessed via the \texttt{typedef} variants \texttt{RcppGSL::VectorView} and \texttt{RcppGSL::IntVectorView}, respectively.

Lastly, in order to support \texttt{const \&} behaviour, all \texttt{gsl_vector_XXX_const_view} variants are also supported (where \texttt{XXX} stands for any of the atomistic \proglang{C} and \proglang{C++} data types).

Matrices

The \pkg{GSL} also defines a set of matrix data types : \texttt{gsl_matrix}, \texttt{gsl_matrix_int} etc ... for which \pkg{RcppGSL} defines a corresponding convenience \proglang{C++} wrapper generated by the \texttt{RcppGSL::matrix} template.

Creating matrices

The \texttt{RcppGSL::matrix} template exposes three constructors.

// convert an R matrix to a GSL matrix
matrix(SEXP x) 

// encapsulate a GSL matrix pointer
matrix(gsl_matrix* x)

// create a new matrix with the given 
// number of rows and columns
matrix(int nrow, int ncol)

Implicit conversion

\texttt{RcppGSL::matrix} defines an implicit conversion to a pointer to the associated \pkg{GSL} matrix type, as well as dereferencing operators. This makes the class \texttt{RcppGSL::matrix} look and feel like a pointer to a \pkg{GSL} matrix type.

gsltype* data;
operator gsltype*() { return data; }
gsltype* operator->() const { return data; }
gsltype& operator*() const { return *data; }

Indexing

Indexing of \pkg{GSL} matrices is usually the task of the functions \texttt{gsl_matrix_get}, \texttt{gsl_matrix_int_get}, ... and \texttt{gsl_matrix_set}, \texttt{gsl_matrix_int_set}, ...

\pkg{RcppGSL} takes advantage of both operator overloading and templates to make indexing a \pkg{GSL} matrix much more convenient.

// create a matrix of size 10x10
RcppGSL::matrix<int> mat(10,10);

// fill the diagonal, no need for setter function
for (int i=0; i<10: i++) {
    mat(i,i) = i;
}

Methods

The \texttt{RcppGSL::matrix} type also defines the following member functions: \begin{quote} \begin{itemize} \item[\texttt{nrow}] extracts the number of rows \item[\texttt{ncol}] extract the number of columns \item[\texttt{size}] extracts the number of elements \item[\texttt{free}] releases the memory (also called via destructor) \end{itemize} \end{quote}

Matrix views

Similar to the vector views discussed above, the \pkg{RcppGSL} also provides an implicit conversion operator which returns the underlying matrix stored in the matrix view class.

Error handler

When input values for \pkg{GSL} functions are invalid, the default error handler will abort the program after printing an error message. This leads \proglang{R} to an abortion error. To avoid this behaviour, one needs to avoid it first by using \texttt{gsl_set_error_handler_off()}, and then detect error conditions by checking whether the result is \texttt{NAN} or not.

// close the GSL error handler
gsl_set_error_handler_off();

// call GSL function with some invalid values
double res = gsl_sf_hyperg_2F1(1, 1, 1.1467003, 1);

// detect the result is NAN or not
if (ISNAN(res)) {
    Rcpp::Rcout << "Invalid input found!" 
                << std::endl;
}

See \url{http://thread.gmane.org/gmane.comp.lang.r.rcpp/7905} for a longer discussion of the related issues.

Starting with release 0.2.4, two new functions are available: \texttt{gslSetErrorHandlerOff()} and \texttt{gslResetErrorHandler()} which allow to turn off the error handler (as discussed above), and to reset to the prior (default) value. In addition, the package now also calls \texttt{gslSetErrorHandlerOff()} when being attached, ensuring that the \pkg{GSL} error handler is turned off by default.

Using \pkg{RcppGSL} in your package

The \pkg{RcppGSL} package contains a complete example package providing a single function \texttt{colNorm} which computes a norm for each column of a supplied matrix. This example adapts a matrix example from the \pkg{GSL} manual that has been chosen primarily as a means to showing how to set up a package to use \pkg{RcppGSL}.

Needless to say, we could compute such a matrix norm easily in \proglang{R} using existing facilities. One such possibility is a simple \verb|apply(M, 2, function(x) sqrt(sum(x^2)))| as shown on the corresponding help page in the example package inside \pkg{RcppGSL}. One point in favour of using the \pkg{GSL} code is that it employs a BLAS function so on sufficiently large matrices, and with suitable BLAS libraries installed, this variant could be faster due to the optimised code in high-performance BLAS libraries and/or the inherent parallelism a multi-core BLAS variant which compute compute the vector norm in parallel. On all `reasonable' matrix sizes, however, the performance difference should be neglible.

The \texttt{configure} script

Using autoconf

Using \pkg{RcppGSL} means employing both the \pkg{GSL} and \proglang{R}. We may need to find the location of the \pkg{GSL} headers and library, and this done easily from a \texttt{configure} source script which \texttt{autoconf} generates from a \texttt{configure.in} source file such as the following:

AC_INIT([RcppGSLExample], 0.1.0)

## Use gsl-config to find arguments for 
## compiler and linker flags
##
## Check for non-standard programs: gsl-config(1)
AC_PATH_PROG([GSL_CONFIG], [gsl-config])
## If gsl-config was found, let's use it
if test "${GSL_CONFIG}" != ""; then
    # Use gsl-config for header and linker args
    # (without BLAS which we get from R)
    GSL_CFLAGS=`${GSL_CONFIG} --cflags`
    GSL_LIBS=`${GSL_CONFIG} --libs-without-cblas`
else
    AC_MSG_ERROR([gsl-config not found, is 
                  GSL installed?])
fi

# Now substitute these variables in src/Makevars.in to create src/Makevars
AC_SUBST(GSL_CFLAGS)
AC_SUBST(GSL_LIBS)

AC_OUTPUT(src/Makevars)

A source file such as this \texttt{configure.in} gets converted into a script \texttt{configure} by invoking the \texttt{autoconf} program.

We note that many other libraries use a similar (but somewhat newer and by-now fairly standard) scripting frontend called \texttt{pkg-config} which be deployed in a very similar by other packages. Calls such as the following two can be used from \texttt{configure} in a very similar manner:

pkg-config --cflags libpng
pkg-config --libs libpng

where \texttt{libpng} (for the png image format) is used just for illustration.

Using functions provided by RcppGSL

\pkg{RcppGSL} provides R functions (in the file \texttt{R/inline.R}) that allow us to retrieve the same information. Therefore the configure script can also be written as:

#!/bin/sh

GSL_CFLAGS=`${R_HOME}/bin/Rscript -e \
    "RcppGSL:::CFlags()"`
GSL_LIBS=`${R_HOME}/bin/Rscript -e \
    "RcppGSL:::LdFlags()"`

sed -e "s|@GSL_LIBS@|${GSL_LIBS}|" \
    -e "s|@GSL_CFLAGS@|${GSL_CFLAGS}|" \
    src/Makevars.in > src/Makevars

Similarly, the configure.win for windows can be written as:

GSL_CFLAGS=`${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe\
    -e "RcppGSL:::CFlags()"`
GSL_LIBS=`${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe \
    -e "RcppGSL:::LdFlags()"`

sed -e "s|@GSL_LIBS@|${GSL_LIBS}|" \
    -e "s|@GSL_CFLAGS@|${GSL_CFLAGS}|" \
    src/Makevars.in > src/Makevars.win

This allows for a simpler and more direct way to just set the compile and link options, taking advantage of the installed \pkg{RcppGSL} package. See the \pkg{RcppZiggurat} package for an example.

The \texttt{src} directory

The \proglang{C++} source file takes the matrix supplied from \proglang{R} and applies the \pkg{GSL} function to each column.

#include <RcppGSL.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>

// [[Rcpp::export]]
Rcpp::NumericVector 
colNorm(const RcppGSL::Matrix & G) {
    int k = G.ncol();
    Rcpp::NumericVector n(k);           // results
    for (int j = 0; j < k; j++) {
        RcppGSL::vector_view<double> colview = 
             gsl_matrix_const_column(G, j);
        n[j] = gsl_blas_dnrm2(colview);
    }
    return n;                           // return
}

The \proglang{Makevars.in} file governs the compilation and uses the values supplied by \texttt{configure} during build-time:

# set by configure
GSL_CFLAGS = @GSL_CFLAGS@
GSL_LIBS   = @GSL_LIBS@

# combine with standard arguments for R
PKG_CPPFLAGS = $(GSL_CFLAGS)
PKG_LIBS = $(GSL_LIBS)

The variables surrounded by \@ will be filled by \texttt{configure} during package build-time. As discussed above, this can either rely on \texttt{autoconf} or a possibly-simpler \texttt{Rscript}.

The \texttt{R} directory

The \proglang{R} source is very simply: it contains a single file created by the \texttt{Rcpp::compileAttributes()} function implementing the wrapper to the \texttt{colNorm()} function.

Input and Output

Because \pkg{RcppGSL} vectors are really C vectors, a little care is needed when using \proglang{C++} \pkg{iostream} output. Vector elements needs to be accessed explicitly via accessors as e.g. gsl_vector_get(x, 0) to display the initial element of a vector x.

Using \pkg{RcppGSL} with \pkg{inline}

The \pkg{inline} package \citep{CRAN:inline} is very helpful for prototyping code in \proglang{C}, \proglang{C++} or \proglang{Fortran} as it takes care of code compilation, linking and dynamic loading directly from \proglang{R}. It has been used extensively by \pkg{Rcpp}, for example in the numerous unit tests.

The example below shows how \pkg{inline} can be deployed with \pkg{RcppGSL}. We implement the same column norm example, but this time as an \proglang{R} script which is compiled, linked and loaded on-the-fly. Compared to standard use of \pkg{inline}, we have to make sure to add a short section declaring which header files from \pkg{GSL} we need to use; the \pkg{RcppGSL} then communicates with \pkg{inline} to tell it about the location and names of libraries used to build code against \pkg{GSL}.

require(inline)

inctxt='
   #include <gsl/gsl_matrix.h>
   #include <gsl/gsl_blas.h>
'

bodytxt='
  // create data structures from SEXP
  RcppGSL::matrix<double> M = sM;     
  int k = M.ncol();
  // to store results
  Rcpp::NumericVector n(k);   

  for (int j = 0; j < k; j++) {
    RcppGSL::vector_view<double> colview = 
        gsl_matrix_column (M, j);
    n[j] = gsl_blas_dnrm2(colview);
  }
  return n;
'

foo <- cxxfunction(
    signature(sM="numeric"),
    body=bodytxt, inc=inctxt, plugin="RcppGSL")

# see Section 8.4.13 of the GSL manual: 
# create M as a sum of two outer products
M <- outer(sin(0:9), rep(1,10), "*") + 
     outer(rep(1, 10), cos(0:9), "*")
foo(M)

The \texttt{RcppGSL} inline plugin supports creation of a package skeleton based on the inline function.

package.skeleton("mypackage", foo)

Using \pkg{RcppGSL} with Rcpp Attributes

\label{sec:attributes}

\textsl{Rcpp Attributes} \citep{CRAN:Rcpp:Attributes} builds on the features of the \pkg{inline} package described in previous section, and streamlines the compilation, loading and linking process even further. It leverages the existing plugins for \pkg{inline}. We already showed the corresponding function in the previous section. Here, we show it again as a self-contained example used via \texttt{sourceCpp()}. We stress that usage is \texttt{sourceCpp()} is meant for interactive work at the R command-prompt, but is not the recommended practice in a package.

#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>

#include <RcppGSL.h>

// declare a dependency on the RcppGSL package; 
// also activates plugin
//
// [[Rcpp::depends(RcppGSL)]]

// declare the function to be 'exported' to R
//
// [[Rcpp::export]]
Rcpp::NumericVector 
colNorm(const RcppGSL::Matrix & M) {
    int k = M.ncol();
    Rcpp::NumericVector n(k);           // results

    for (int j = 0; j < k; j++) {
        RcppGSL::VectorView colview = 
            gsl_matrix_const_column (M, j);
        n[j] = gsl_blas_dnrm2(colview);
    }
    return n;                           // return
}

/*** R
## see Section 8.4.13 of the GSL manual:
## create M as a sum of two outer products
M <- outer(sin(0:9), rep(1,10), "*") +
     outer(rep(1, 10), cos(0:9), "*")
colNorm(M)
*/

With the code above stored in a file, say, ``gslNorm.cpp'' one can simply call \texttt{sourceCpp()} to have the wrapper code added, and all of the compilation, linking and loading done --- including the execution of the short \proglang{R} segment at the end:

sourceCpp("gslNorm.cpp")

The function \texttt{cppFunction()} is also available to convert a simple character string argument containing a valid C++ function into a eponymous R function. And like \texttt{sourceCpp()}, it can also use plugins. See the vignette ``Rcpp-attributes'' \citep{CRAN:Rcpp:Attributes} of the \pkg{Rcpp} package \citep{CRAN:Rcpp} for full details.

Summary

The GNU Scientific Library (GSL) by \citet{GSL} offers a very comprehensive collection of rigorously developed and tested functions for applied scientific computing under a widely-used and well-understood Open Source license. This has lead to widespread deployment of \pkg{GSL} among a number of disciplines.

Using the automatic wrapping and converters offered by the \pkg{RcppGSL} package presented here, \proglang{R} users and programmers can now deploy algorithmns provided by the \pkg{GSL} with greater ease.

\newpage



eddelbuettel/rcppgsl documentation built on March 27, 2024, 12:16 p.m.