R/tape_bdryw.R

Defines functions tape_bdryw

Documented in tape_bdryw

#' Generate a tape of a custom boundary weight function
#' @family tape builders
#' @param fileORcode A character string giving the path name of a file containing the unnormalised log-density definition *OR* code. `fileORcode` will be treated as a file name if `fileORcode` contains no new line characters ('\\n' or '\\r\\n') and has a file extension detected by [`tools::file_ext()`].
#' @param Cppopt List of named options passed to `Rcpp::sourceCpp()`
#' @param x Vector giving location in the manifold.
#' @description
#' Generate a tape of a boundary weight function, which is useful for specifying the boundary of manifolds in score matching.
#' Use `tape_bdryw()` to specify a custom function using `C++` code much like `TMB::compile()`.
#' Use `tape_bdryw_inbuilt()` for tapes of inbuilt functions implemented in this package.
#' @details
#' The function `tape_bdryw()` uses [`Rcpp::sourceCpp()`] to generate a tape of a function defined in C++. 
#'
#' The result result is NOT safe to save or pass to other CPUs in a parallel operation.
#' 
#' # Writing the `fileORcode` Argument
#' The code (possibly in the file pointed to by `fileORcode`) must be `C++` that uses only `CppAD` and `Eigen`, which makes it very similar to the requirements of the input to `TMB::compile()` (which also uses `CppAD` and `Eigen`).
#' 
#' The start of `code` should always be "`a1type fname(const veca1 &x){`" where `fname` is your chosen name of the function, `x` represents a point in the manifold. This specifies that the function will a single two vector argument (of type `veca1`) and will return a single numeric value (`a1type`).
#' 
#' The type `a1type` is a double with special ability for being taped by `CppAD`. The `veca1` type is a vector of `a1type` elements, with the vector wrapping supplied by the `Eigen` C++ package (that is an `Eigen` matrix with 1 column and dynamic number of rows).
#' In place operations like `+=` on `a1type`, `veca1` and similar have not worked for me (compile errors); fortunately the efficiency of in place operations is irrelevant for taping operations. 
#' 
#' The body of the function must use operations from Eigen and/or CppAD, prefixed by `Eigen::` and `CppAD::` respectively. 
#' There are no easy instructions for writing these as it is genuine `C++` code, which can be very opaque to those unfamiliar with `C++`.
#' However, recently ChatGPT and claude.ai have been able to very quickly translating `R` functions to `C++` functions (KLH has been telling these A.I. to use Eigen and CppAD, and giving the definitions of `a1type` and `veca1`).
#' I've found the quick reference pages for for [`Eigen`](https://libeigen.gitlab.io/eigen/docs-nightly/group__QuickRefPage.html) useful. Limited unary and binary operations are available directly from [`CppAD`](https://cppad.readthedocs.io) without `Eigen`. 
#'
#' Non-smooth functions should be used with care, but can be specified using `CppAD`'s CondExp functions.
#' @examples
#' \dontrun{
#' out <- tape_bdrw(
#' "a1type myminsq(const veca1 &x){
#' veca1 xsq = x.array().square();
#' a1type minval(0.1 * 0.1);
#' for(size_t i=0;i<x.size();i++){
#'   minval = CppAD::CondExpLe(xsq[i], minval, xsq[i], minval);
#' }
#' return(minval);
#' }", 
#'    rep(0.2, 5))
#' out$fun(c(0.01, 0.2, 0.2, 0.2, 0.2))
#' out$tape$forward(0, c(0.1, 0.2, 0.2, 0.2, 0.2))
#' out$tape$Jacobian(c(0.1, 0.2, 0.2, 0.2, 0.2))
#' out$tape$name
#' }
#' @return A list of three objects
#' + `fun` a function that evaluates the function directly
#' + `tape` a tape of the function
#' + `file` the temporary file storing the final source code passed to [`Rcpp::sourceCpp()`]
#' @importFrom RcppEigen fastLmPure
#' @export
tape_bdryw <- function(fileORcode = "", x, Cppopt = NULL){
  #read code
  filelines <- string_to_lines(fileORcode)
  if ((length(filelines) == 1) && (nchar(tools::file_ext(fileORcode)) > 0)){
    filelines <- readLines(fileORcode)
  }

  #remove empty first lines
  blank_lines <- grepl("^\\s*$", filelines)
  filelines <- filelines[1:length(filelines) >= min(which(!blank_lines))]

  # check provided code
  inputsloc <- regexpr("^[[:space:]]*a1type (?<fname>[^[:space:]]+)\\((?<arg1>[^\\)]+)", filelines[1], perl = TRUE)
  starts <- attr(inputsloc, "capture.start")
  lengths <- attr(inputsloc, "capture.length")
  if (lengths[, "fname"] < 1){stop("Could not find function name")}
  if (lengths[, "arg1"] < 1){stop("Could not find first argument")}
  arg1 <- substr(filelines[1], starts[, "arg1"], starts[, "arg1"] + lengths[, "arg1"] - 1)
  if (!grepl("^const +veca1 *&", arg1)){stop(sprintf("First argument should have type 'const veca1 &', instead it is %s", arg1))}
  fname <- substr(filelines[1], starts[, "fname"], starts[, "fname"] + lengths[, "fname"] - 1)
  if(fname == "tapebdryw"){stop("Please choose a name that is not 'tapebdryw' - tapebdryw is used internally")}


  # set up C++ code
  expandedfile <- tempfile(fileext = ".cpp")

  ## preamble
  cat(c(
"#include <scorematchingad_forward.h>",
"#include <Rcpp.h>",
"#include <scorematchingad.h>",
"// [[Rcpp::depends(RcppEigen)]]",
"// [[Rcpp::depends(scorematchingad)]]",
"// [[Rcpp::export]]"
), file = expandedfile, append = FALSE, sep = "\n")
  ## add in file
  cat(filelines, file = expandedfile, append = TRUE, sep = "\n")
  
  ## taping addendum
  cat(gsub("__FNAME__", fname, tapingboilerplate_bdryw), file = expandedfile, append = TRUE, sep = "\n")

  tryCatch({
  funs <- new.env()
  # compile
  compileout <- do.call(Rcpp::sourceCpp, c(list(file = expandedfile, env = funs), Cppopt))},
    error = function(e){stop("Could not sourceCpp() ", expandedfile, ".", e$message)})

  # execute exported tape-generating function
  tapeptr <- funs$tapebdryw(x)
  
  # return tape and function
  list(fun = funs[[compileout$functions[[1]]]],
       tape = tapeptr,
       file = expandedfile)
}

    
   
  ## boiler plate for taping
tapingboilerplate_bdryw <- c(
"// [[Rcpp::export]]",
"pADFun tapebdryw(veca1 & x){",
"  CppAD::ADFun<double> tape;",
"  CppAD::Independent(x);",
"  veca1 y(1);",
"  y(0) = __FNAME__(x);",
"  tape.Dependent(x, y);",
"  veca1 empty(0);",
"  pADFun out(tape, x, empty, \"__FNAME__\");",
"  return(out);",
"}"
)

bdrywnames <- c(
"prodsq",
"minsq",
"ones")

Try the scorematchingad package in your browser

Any scripts or data that you put into this service are public.

scorematchingad documentation built on Sept. 1, 2025, 9:08 a.m.