```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```r
library(a5R)
```
a5R can parallelise vectorised operations using multiple threads via rayon. By default a5R uses a single thread, so there is zero overhead. You opt in to parallelism when you need it.
```r
# Check the current setting (default: 1)
a5_get_threads()

# Use 4 threads
a5_set_threads(4)
a5_get_threads()
```

```r
a5_set_threads(1)
```
You can also set the thread count at package load time via an R option or an environment variable, which is useful for scripts and batch jobs:
```r
# In .Rprofile or at the top of a script
options(a5R.threads = 4)

# Or as an environment variable
# Sys.setenv(A5R_NUM_THREADS = 4)
```
a5_set_threads() invisibly returns the previous value, making temporary
changes easy:
```r
old <- a5_set_threads(4)
# ... parallel work ...
a5_set_threads(old)
```
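Inside a function, the same pattern can be made exception-safe with on.exit(), so the previous setting is restored even if the parallel work errors. A minimal sketch; with_threads() is a hypothetical helper, not part of a5R, and it relies only on a5_set_threads() returning the previous value as described above:

```r
# Hypothetical helper: evaluate an expression with a temporary thread count.
with_threads <- function(n, expr) {
  old <- a5_set_threads(n)
  on.exit(a5_set_threads(old), add = TRUE)  # restore even if expr errors
  force(expr)
}

# boundaries <- with_threads(8, a5_cell_to_boundary(cells, format = "wkt"))
```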
Threading applies to vectorised functions that process each element independently:
| Function | Per-element cost | Benefit |
|---|---|---|
| a5_cell_to_boundary() | Heavy (boundary + WKT/WKB) | High |
| a5_grid() | Heavy (boundary filtering) | High |
| a5_lonlat_to_cell() | Moderate (projection) | High |
| a5_cell_distance() | Moderate (2x projection + distance) | Medium |
| a5_cell_to_lonlat() | Moderate (reverse projection) | Medium |
| a5_cell_to_parent() | Light (bit ops + hex) | Low |
| a5_get_resolution() | Light (bit ops) | Low |
| a5_is_valid() | Light (hex parse) | Low |
Scalar and bulk operations (a5_cell_to_children(), a5_compact(),
a5_cell_area(), etc.) are unaffected --- they are already fast or delegate
to algorithms that don't parallelise element-wise.
Threading has a small fixed overhead (thread synchronisation, memory allocation for intermediate results). For small vectors this overhead can outweigh the benefit. As a rule of thumb, keep the default single thread for small inputs and enable threading only for large vectors where the per-element work is heavy (see the table above).
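One way to act on that rule of thumb is to gate threading on input size. A sketch under stated assumptions: threaded_boundary() and the 100,000-element threshold are illustrative, not part of a5R, and the right threshold depends on your hardware, so benchmark your own workload:

```r
# Illustrative wrapper: only pay the thread-pool overhead for large inputs.
threaded_boundary <- function(cells, threshold = 1e5, threads = 8, ...) {
  if (length(cells) >= threshold) {
    old <- a5_set_threads(threads)
    on.exit(a5_set_threads(old), add = TRUE)  # restore the previous setting
  }
  a5_cell_to_boundary(cells, ...)
}
```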
Here's a quick comparison on roughly 700k cells:
```r
cells <- a5_grid(c(-10, 50, 10, 60), resolution = 12)
length(cells)
#> [1] 704259

a5_set_threads(1)
system.time(a5_cell_to_boundary(cells, format = "wkt"))
#>    user  system elapsed
#>   3.124   0.000   3.122

a5_set_threads(8)
system.time(a5_cell_to_boundary(cells, format = "wkt"))
#>    user  system elapsed
#>   6.195   1.289   1.667
```
Note that user time increases (total CPU work across all threads) while
elapsed (wall-clock) time decreases --- that's the parallelism at work.
a5R uses a dedicated rayon thread pool, separate from R's own parallelism. It is safe to use alongside future, mirai, and similar frameworks, but be mindful of nested parallelism: if every R worker process also spawns multiple rayon threads, you can oversubscribe the CPU and degrade performance.
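For example, when fanning work out across R processes, one reasonable approach is to keep a5R single-threaded inside each worker so the total thread count matches the worker count. A sketch assuming the future and future.apply packages; chunks_of_cells is a hypothetical list of cell-ID vectors, not something a5R provides:

```r
library(future.apply)
plan(multisession, workers = 4)  # four R worker processes

# chunks_of_cells: hypothetical list splitting a large cell vector into chunks
results <- future_lapply(chunks_of_cells, function(chunk) {
  a5R::a5_set_threads(1)  # one rayon thread per worker: 4 workers x 1 thread
  a5R::a5_cell_to_boundary(chunk, format = "wkt")
})

plan(sequential)  # shut the workers down
```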
The thread pool is rebuilt each time you call a5_set_threads(), so changing the count mid-session is fine and cheap, but not free. Ideally, set it once at the start of your workflow rather than toggling it per call.