Writing C++ and integrating it with your R packages can provide a tremendous speed boost, and lets a useR integrate with lots of interesting C and C++ libraries developed by other people. C++ and R are very different languages, however, with very different conventions, particularly around strings. In R, most string operations are performed on entire vectors of strings and consist of a single function call. In C++, most string operations require very precise manipulation and only apply to one string at a time - some of them only apply to one character of a string at a time!

rope is intended to be at least a partial solution to this, by providing simple, R-style functions for common string manipulation operations and converting between types. To dive straight into the functionality, see the other two vignettes:

  1. String Manipulation with rope
  2. Type Checking and Conversion with rope

Motivation and design philosophy

rope is designed to make this particular corner of C++ look a lot like R - which is pretty horrifying to C++ programmers, but oh well. It has two target audiences. The first are R programmers who are just starting out with C++, are getting used to having to do things at a very low level, and spend a lot of their time frantically googling how to convert between different types.

The second are maybe more experienced R/C++ programmers who have got their head around both of these things, but still find themselves spending an annoying amount of time either (a) remembering how to do this one specific thing to strings or (b) carrying an implementation of that thing around with them, from package to package, because it isn't provided by the base language.

In both cases, rope hopefully provides convenient implementations of lots of common string-based operations, from find-and-replace to string splitting to pasting. All of them come in two forms - one operating on single elements, and one operating on entire vectors of elements.
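To make the "two forms" idea concrete, here is a minimal sketch of the pattern using only the standard library. The names replace_all_single and replace_all_vector are hypothetical and chosen for illustration; they are not rope's actual function names or signatures, and rope's own implementation may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical single-element form: replace every occurrence of `from`
// in one string. Illustrative only; not rope's API.
std::string replace_all_single(std::string x, const std::string& from,
                               const std::string& to) {
  if (from.empty()) {
    return x; // nothing sensible to search for
  }
  std::size_t pos = 0;
  while ((pos = x.find(from, pos)) != std::string::npos) {
    x.replace(pos, from.size(), to);
    pos += to.size(); // skip past the replacement so it is not rescanned
  }
  return x;
}

// Hypothetical vector form: apply the single-element form to each element.
std::vector<std::string> replace_all_vector(std::vector<std::string> x,
                                            const std::string& from,
                                            const std::string& to) {
  for (std::string& element : x) {
    element = replace_all_single(element, from, to);
  }
  return x;
}
```

The vector form is just a loop over the single-element form, which is the sort of boilerplate that otherwise gets copied from package to package.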

Structure and functionality

rope can be included via an #include call to a single header file, <rope/rope.h>; alternatively, you can include specific sub-units that provide particular bits of functionality. These are:

  1. <rope/case.h>, which allows you to upper- or lower-case strings;
  2. <rope/paste.h>, which provides convenient paste-like functionality in C++;
  3. <rope/remove.h>, which allows you to remove particular substrings from a string;
  4. <rope/replace.h>, which lets you replace particular substrings;
  5. <rope/reverse.h>, to reverse strings;
  6. <rope/split.h>, which lets you easily split and tokenise strings;
  7. <rope/check.h>, which allows you to check if a string is numeric, alphabetic, alphanumeric or hexadecimal, and;
  8. <rope/convert.h>, which allows you to convert a string into various other types (or other types into strings).

1 through to 6 are documented in the "String Manipulation with rope" vignette; 7 and 8 in "Type Checking and Conversion with rope".

Using rope

rope takes advantage of the Rcpp dependency concept, which allows you to easily link installed C++ modules into your code. Suppose we wanted a function that would split a string and count how many elements there were once it was split. If you wanted to do that as a standalone piece of C++, rather than as part of a package, you'd write a count.cpp file that looked something like:

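```{Rcpp eval = FALSE}
//[[Rcpp::depends(rope)]]
#include <rope/rope.h>

//[[Rcpp::export]]
// delimiter is assumed to be a std::string, matching to_split
int count_elements(std::string to_split, std::string delimiter){
  // Split at each delimiter, then count the resulting pieces.
  std::vector < std::string > holding = string_split(to_split, delimiter);
  return holding.size();
}
```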

This would then create an exported C++ function, accessible from R, that takes two arguments - to_split and delimiter - and returns the number of elements of the string to split after it's been split at each delimiter.
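For readers who want to see what "split, then count" involves without rope, the following is a rough standard-library sketch. split_on and count_pieces are hypothetical names, and the choices made here (empty fields are kept, an empty delimiter means "do not split") are assumptions for illustration; rope's string_split may behave differently in those edge cases.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-in for a split function; not rope's implementation.
std::vector<std::string> split_on(const std::string& to_split,
                                  const std::string& delimiter) {
  std::vector<std::string> pieces;
  if (delimiter.empty()) {
    pieces.push_back(to_split); // assumption: empty delimiter means no split
    return pieces;
  }
  std::size_t start = 0;
  std::size_t hit = to_split.find(delimiter, start);
  while (hit != std::string::npos) {
    pieces.push_back(to_split.substr(start, hit - start));
    start = hit + delimiter.size();
    hit = to_split.find(delimiter, start);
  }
  pieces.push_back(to_split.substr(start)); // text after the last delimiter
  return pieces;
}

// Count how many pieces the string falls into once split.
int count_pieces(const std::string& to_split, const std::string& delimiter) {
  return static_cast<int>(split_on(to_split, delimiter).size());
}
```

Under these assumptions, splitting "a,b,c" at "," gives three pieces, and a string that does not contain the delimiter gives one.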

If you wanted to include this functionality in a package, it would look exactly the same, but rope would be added to the LinkingTo entry in your DESCRIPTION file.

Testing rope

rope's functions are not exported, but do have R-side wrappers for testing purposes; these can be called using ::: notation if you'd like to quickly experiment with the package and see what sort of output you can expect. The names can be deduced by looking at ./src/rope.cpp or the unit tests, but follow a common convention: they have the same name as the underlying C++ function, with _single or _vector added on the end to distinguish the ones designed for single elements from the ones designed for entire vectors.

So, if we wanted to play around with string_split in R when testing or designing our code, and intended to use it to split single strings, we could call rope:::string_split_single(). If we wanted a vectorised version, we could call rope:::string_split_vector().



PeteHaitch/rope documentation built on May 8, 2019, 1:32 a.m.