The goal of sfcc is to provide fast and flexible creation of sf
simple features geometries.
Originally for sfc
vectors of POINT
, directly motivated by the set-based idioms of silicate, and the need to bypass the thorough checking done by the sf constructors.
The design for the code is this:
For example
points_cpp
compiles to an internal R functionpoints_rcpp
R function to ensure sanity of inputsmk_sfc_POINT
R function to organize the inputs and apply the sfc scaffoldingFor MULTIPOLYGON
multipolygons_cpp
compiles to an internal R functionmultipolygons_rcpp
R functionmk_sfc_MULTIPOLYGON
What might also work is to have a single POLYGON constructor, that optionally applies the sfg level classing. Then MULTIPOLYGON wrapper can call that for each of its sub objects, with "no class please" and then apply its own class at the right level. But, these things are pretty simple so a bit of duplication is also ok.
It's not finalize what the best way to specify grouping is, but it's driven from silicate. (There's hopefully a flexible set of different ways to do this, including aes(x, y, group = ..., subobject = ...)
and other basics like earcut.hpp, silicate objects, other types).
For now we don't consider GEOMETRY (sfc vectors with mixed topology).
I wouldn't use this as-is, yet. :)
This example was motivated by this issue, trying to speed up sf creation for POINTs.
The improvement is fairly modest, so there might be a better Rcpp way to do this?
N <- 1e5
x <- data.frame(a = sample(letters, N, replace = TRUE),
lng = runif(N, -120, -100),
lat = runif(N, 30, 48))
## we avoid st_sf so we can compare apples (just the sfc creation)
sf_mk_points <- function(x, coords) {
classdim = sf:::getClassDim(rep(0, length(coords)), length(coords), dim, "POINT")
structure( lapply(split(as.vector(t(as.matrix(x[, coords]))),
rep(seq_len(nrow(x)), each = length(coords))),
## it does help to create this as a template and update in this loop
## but not as much as list-creation in C++
function(vec) structure(vec, class = classdim)),
n_empty = 0L, precision = 0, crs = NA_crs_,
bbox = structure(
c(xmin = min(x[[coords[1]]], na.rm = TRUE),
ymin = min(x[[coords[2]]], na.rm = TRUE),
xmax = max(x[[coords[1]]], na.rm = TRUE),
ymax = max(x[[coords[2]]], na.rm = TRUE)), class = "bbox"),
class = c("sfc_POINT", "sfc" ))
}
m <- cbind(x$lng, x$lat)
library(rbenchmark)
library(sf)
#> Linking to GEOS 3.6.2, GDAL 2.2.4, proj.4 4.9.3
benchmark(
sfcc_pt = sfcc::mk_sfc_POINT(m, crs = 4326),
sf_pt = sf_mk_points(x, coords = c("lng", "lat"))
,replications = 10
)
#> test replications elapsed relative user.self sys.self user.child
#> 2 sf_pt 10 16.844 17.787 16.815 0.016 0
#> 1 sfcc_pt 10 0.947 1.000 0.935 0.012 0
#> sys.child
#> 2 0
#> 1 0
A polygon example (motivated by spex::polygonize
)
qm <- quadmesh::quadmesh(raster::raster(volcano), z = NULL)
## a dummy structure to copy
template <- structure(list(cbind(1:5, 0)),
class = c("XY", "POLYGON", "sfg"))
nquads <- ncol(qm$ib)
idx <- rep(seq(0, nquads), each = 5L) + c(1L, 2L, 3L, 4L, 1L)
quadgroups <- rep(seq_len(nquads), each = 5)
idx <- rbind(qm$ib, qm$ib[1,])
xylist <- split(c(t(qm$vb[1:2, idx])), rep(quadgroups, 2L))
## slower code
system.time({
l <- lapply(seq_along(xylist), function(ii) {
template[[1L]][] <- xylist[[ii]]
template
})
})
#> user system elapsed
#> 0.051 0.000 0.051
## Rcpp code is a bit faster
library(sfcc)
## we were working with a flat vector
xylistm <- lapply(xylist, function(v) matrix(v, ncol = 2))
system.time(p <- polygons_rcpp(xylistm))
#> user system elapsed
#> 0.012 0.000 0.012
identical(l, p)
#> [1] TRUE
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.