simdjson by Daniel Lemire (with contributions by Geoff Langdale, John Keiser and many others) is an engineering marvel. Through very clever use of SIMD instructions, it manages to parse JSON files faster than disk access. Wut? Yes you read that right: parallel processing with so little overhead that the net throughput is limited only by disk speed.
Moreover, it is implemented in neat modern C++ and can be accessed as a header-only library. (Well, one library in two files, really.) Which makes R packaging easy and convenient and compelling. So here we are.
    jsonfile <- system.file("jsonexamples", "twitter.json", package="RcppSimdJson")
    library(RcppSimdJson)
    validateJSON(jsonfile)        # validate a JSON file
    res <- fload(jsonfile)        # parse a JSON file
A simple parsing benchmark against four other R-accessible JSON parsers:
    R> res
    Unit: milliseconds
         expr      min       lq     mean   median       uq       max neval  cld
     simdjson  1.87118  2.03252  2.24351  2.17228  2.27756   6.57145   100 a
      jsonify  8.91694  9.20124  9.58652  9.46077  9.73692  13.41707   100  b
      RJSONIO 10.49187 11.09410 11.69109 11.42555 11.95780  17.93653   100  b
       ndjson 27.04830 28.62251 31.44330 29.51343 32.05847 146.88221   100   c
     jsonlite 34.93334 36.54784 38.67843 37.74890 40.19555  46.32444   100    d
    R>
Or in chart form: [Figure: chart of the benchmark timings listed above.]
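A comparison of this kind could be set up along the following lines. This is only a minimal sketch, not the script that produced the numbers above: it assumes the microbenchmark and jsonlite packages are installed, covers just two of the five parsers, and passes the file path directly to each parser, which may differ from how the original benchmark fed its input.

```r
library(RcppSimdJson)
library(microbenchmark)   # assumption: installed separately

# Example file shipped with RcppSimdJson
jsonfile <- system.file("jsonexamples", "twitter.json", package = "RcppSimdJson")

# Time each parser 100 times on the same file
res <- microbenchmark(
    simdjson = RcppSimdJson::fload(jsonfile),
    jsonlite = jsonlite::fromJSON(jsonfile),   # assumption: jsonlite is installed
    times = 100L
)
print(res)
```

Timings will vary with hardware, compiler, and package versions, so the absolute values should not be expected to match the table.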
As of version 0.1.0, all three major OSs are supported, and JSON can be parsed from file and string under a variety of settings. The package prefers a real C++17 compiler but can fall back to an older compiler. Still subject to change.
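For parsing from a string rather than a file, a minimal sketch might look like the following. It assumes the package's `fparse()` function (the string counterpart of `fload()`) with its default settings; the JSON text itself is made up for illustration.

```r
library(RcppSimdJson)

# Hypothetical JSON document held in an R character string
json <- '{"name": "simdjson", "stars": [1, 2, 3], "fast": true}'

# Parse the string; with default settings a JSON object becomes a named list
res <- fparse(json)
str(res)
```

Since the interface is still subject to change, the package documentation for the installed version is the place to check the available arguments.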
Any problems, bug reports, or feature requests for the package can be submitted and handled most conveniently as GitHub issues in the repository.
For standard JSON work in R, as well as for other nicely done C++ libraries, consider these: