R/dotCall64.R
In dotCall64: Enhanced Foreign Function Interface Supporting Long Vectors

Documented in .C64

#' dotCall64 - Extended Foreign Function Interface
#' 
#' \code{.C64} can be used to call compiled and loaded C/C++ functions and Fortran subroutines.
#' \code{.C64} is similar to \code{\link{.C}} and \code{\link{.Fortran}}, but
#' \enumerate{
#'    \item supports long vectors, i.e., vectors with more than \code{2^31-1} elements
#'    \item does the necessary castings to expose the R representation of "64-bit integers" (numeric vectors)
#' to 64-bit integer arguments of the compiled function. The latter are int64_t in C code and integer (kind = 8) in Fortran code
#'    \item provides a mechanism the control duplication of the R objects exposed to the compiled code
#'    \item checks if the provided R objects are of the expected types and coerces them if necessary
#' }
#' Compared to \code{\link{.C}}, \code{.C64} has the additional arguments \code{SIGNATURE}, \code{INTENT} and \code{VERBOSE}.
#' \code{SIGNATURE} specifies the types of the arguments of the compiled function.
#' \code{INTENT} indicates whether the compiled function "reads", "writes",
#' or "read and writes" to the R objects passed to the compiled function.
#' This information is then used to duplicate R objects if and only if necessary.
#' 
#' @param .NAME character vector of length 1. Name of the compiled function to be called.
#' @param SIGNATURE character vector of the same length as the number of arguments of the compiled function.
#' Accepted strings are \code{"double"}, \code{"integer"}, and \code{"int64"}.
#' They describe the signature of each argument of the compiled function.
#' @param ... arguments passed to the compiled function. One R object for each argument. Up to 65 arguments are supported.
#' @param INTENT character vector of the same length as the number of arguments of the compiled function.
#' Accepted strings are \code{"rw"}, \code{"r"}, and \code{"w"}, which indicate
#' whether the intent of the argument is "read and write", "read", or "write", respectively.
#' If the INTENT of an argument is \code{"rw"}, the R object is copied and the
#' compiled function receives a pointer to that copy.
#' If the INTENT of an R object is \code{"r"}, the compiled
#' function receives a pointer to the R object itself.
#' While this avoids copying and hence is more efficient in terms of speed and memory usage,
#' it is absolutely necessary that the compiled function does not alter the object,
#' since this corrupts the R object in the current R session.
#' When the INTENT is \code{"w"}, the corresponding input argument can be specified
#' with the function \code{\link{vector_dc}} or its shortcuts \code{\link{integer_dc}} and \code{\link{numeric_dc}}.
#' This avoids copying the passed R objects and hence is more efficient in terms of speed and memory usage.
#' By default, all arguments have INTENT \code{"rw"}.
#' @param NAOK logical vector of length 1. If \code{FALSE} (default), the presence of \code{NA}, \code{NaN}, and \code{Inf}
#' in the R objects passed through \code{...} results in an error.
#' If \code{TRUE}, \code{NA}, \code{NaN}, and \code{Inf} values are passed to the compiled function.
#' The used time to check arguments (if \code{FALSE}) is considerable for large vectors. 
#' @param PACKAGE character vector of length 1. Specifies where to search for the function given in \code{.NAME}. 
#' This is intended to add safety for packages,
#' which can use this argument to ensure that no other package can override their external symbols,
#' and also speeds up the search.
#' @param VERBOSE numeric vector of length 1. If \code{0}, no warnings are printed.
#' If \code{1}, warnings are printed, which help to improve the performance of the call.
#' If \code{2}, additional debug information is given as warnings.
#' The default value can be changed via the \code{dotCall64.verbose} option, which is set to \code{0} by default. 
#'  
#' @return  list of R objects similar to the list of arguments specified as \code{...} arguments.
#' The objects of the list reflect the changes made by the compiled C or Fortran function.
#'  
#' @references
#' F. Gerber, K. Moesinger, R. Furrer (2018),
#' dotCall64: An R package providing an efficient interface to compiled C, C++, and Fortran code supporting long vectors,
#' SoftwareX 7, 217-221,
#' https://doi.org/10.1016/j.softx.2018.06.002.
#'  
#' F. Gerber, K. Moesinger, and R. Furrer (2017),
#' Extending R packages to support 64-bit compiled code: An illustration with spam64 and GIMMS NDVI3g data,
#' Computer & Geoscience 104, 109-119,
#' https://doi.org/10.1016/j.cageo.2016.11.015.
#'
#' @examples
#' ## Consider the following C function, which is included
#' ## in the dotCall64 package:  
#' ## void get_c(double *input, int *index, double *output) {
#' ##     output[0] = input[index[0] - 1];
#' ## }
#' ##
#' ## We can use .C64() to call get_c() from R:
#' .C64("get_c", SIGNATURE = c("double", "integer", "double"),
#'      input = 1:10, index = 9, output = double(1))$output
#'
#' \dontrun{
#' ## 'input' can be a long vector
#' x_long <- double(2^31) ## requires 16 GB RAM
#' x_long[9] <- 9; x_long[2^31] <- -1
#' .C64("get_c", SIGNATURE = c("double", "integer", "double"),
#'      input = x_long, index = 9, output = double(1))$output
#'
#' ## Since 'index' is of type 'signed int' (a 32-bit integer),
#' ## it can only address the first 2^31-1 elements of 'input'.
#' ## To also address elements beyond 2^31-1, we change the
#' ## definition of the C function as follows:
#' ## #include <stdint.h>  //  for int64_t 
#' ## void get64_c(double *input, int64_t *index, double *output) {
#' ##     output[0] = input[index[0] - 1];
#' ## }
#' 
#' ## Now, we can use .C64() to call get64_c() from R.
#' .C64("get64_c", SIGNATURE = c("double", "int64", "double"),
#'      input = x_long, index = 2^31, output = double(1))$output
#' ## Note that 2^31 is of type double and .C64() casts it into an
#' ## int64_t type before calling the C function get64_c().
#'
#' ## The performance of the previous call can be improved by
#' ## setting additional arguments:
#' .C64("get64_c", SIGNATURE = c("double", "int64", "double"),
#'      x = x_long, i = 2^31, r = numeric_dc(1), INTENT = c("r", "r", "w"),
#'      NAOK = TRUE, PACKAGE = "dotCall64", VERBOSE = 0)$r
#' 
#'
#' ## Consider the same function defined in Fortran:
#' ##      subroutine get64_f(input, index, output)
#' ##      double precision :: input(*), output(*)
#' ##      integer (kind = 8) :: index  ! specific to GFortran
#' ##      output(1) = input(index)
#' ##      end
#' 
#' ## The function is provided in dotCall64 and can be called with
#' .C64("get64_f", SIGNATURE = c("double", "int64", "double"),
#'      input = x_long, index = 2^31, output = double(1))$output
#' }
#' @useDynLib dotCall64, .registration = TRUE
#' @export
#' @name dotCall64
.C64 <- function(.NAME, SIGNATURE, ..., INTENT = NULL, NAOK = FALSE,
                 PACKAGE = "", VERBOSE = getOption("dotCall64.verbose")) {
  .External("dC64", name = .NAME, SIGNATURE = SIGNATURE, ..., INTENT = INTENT, NAOK = NAOK,
            f.PACKAGE = PACKAGE, VERBOSE = VERBOSE, PACKAGE = "dotCall64")
    }