knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) Sys.setenv(LANG = "en_US.UTF-8") options(rmarkdown.html_vignette.check_title = FALSE)
sf::st_as_sf(data.table::rbindllist(<list_of_sf>))
is a
fast alternative
to do.call(rbind, <list_of_sf>)
for binding list
s of simple features objects
(sf
) to single sf
. However, there are some pitfalls when rbindlist()
is
applied to list
s of sf
. These are explored, and some workarounds are
presented.
library(dplyr) library(data.table) library(sf)
rbindlist()
and do.call(rbind, )
bbox
We create a list
of sf
objects:
nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE) list_of_sf <- lapply(seq_len(nrow(nc)), function(x) nc[x, ])
Binding such a list
of sf
with do.call(rbind, )
to a single sf
object
works easily (as long as the list
is rather small):
nc_rbind <- do.call(rbind, list_of_sf)
And will return the expected object of the class sf
:
all.equal(nc_rbind, nc)
What's different when rbindlist()
is used instead of do.call(rbind, )
?
First of all, rbindlist()
on its own does not bind a list
of sf
to an
object of the class sf
:
rbindlist(list_of_sf) %>% class()
Thus, subsequently the return of rbindlist()
needs to be converted into an
object of the the class sf
:
nc_rbindlist <- rbindlist(list_of_sf) %>% st_as_sf()
Is the return of the pipe rbindlist() %>% st_as_sf()
equivalent to the one of
do.call(rbind, )
?
all.equal(nc_rbindlist, nc)
No, not really! The pipe-output has inherited the class data.table
:
class(nc) class(nc_rbindlist)
Will omitting the class data.table
eliminate the difference?
class(nc_rbindlist) <- c("sf", "data.frame") all.equal(nc_rbindlist, nc)
There's also an issue with bbox
of the geometry column:
st_bbox(nc_rbindlist) == st_bbox(nc)
Instead of returning the correct bbox
for the whole geometry column,
data.table::rbindlist()
simply copies the bbox
of the first list
ed
sf
object:
st_bbox(nc_rbindlist) == st_bbox(list_of_sf[[1]])
A simple trick solves
this issue inherent to sf
-objects compiled with data.table::rbindlist()
:
sf_former_dt <- sf_former_dt[1:nrow(sf_former_dt), ] # or, preferable in a programming context: sf_former_dt <- sf_former_dt[seq_len(nrow(sf_former_dt)), ]
We apply the trick ...
nc_rbindlist <- nc_rbindlist[seq_len(nrow(nc_rbindlist)), ]
... check if it worked:
all.equal(nc_rbindlist, nc)
list
s containing sf
-objects with different CRS are treated quite differently
by do.call(rbind, )
and rbindlist() %>% st_as_sf()
. To demonstrate that, we
create such a list
:
nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE) nc_3857 <- st_transform(nc, 3857) list_of_sf_with_unequal_crs <- list(nc, nc_3857)
rbind()
does not bind sf
-objects with unequal CRS and throws a corresponding
error-message:
do.call(rbind, list_of_sf_with_unequal_crs)
Whereas rbindlist()
doesn't seem to be affected:
sf_dt <- list_of_sf_with_unequal_crs %>% rbindlist() %>% st_as_sf()
The returned object of the classes sf
and data.table
has a CRS which is the
same as the one of first list
ed sf
-object:
st_crs(sf_dt)$epsg st_crs(sf_dt) == st_crs(nc)
If we reverse the sequence of elements of the list
of sf
-objects with
unequal CRS the newly first placed sf
-object dictates the CRS of the by
rbindlist() %>% st_as_sf
bound single sf
-object:
sf_dt <- list_of_sf_with_unequal_crs %>% rev() %>% rbindlist() %>% st_as_sf() st_crs(sf_dt)$epsg st_crs(sf_dt) == st_crs(nc_3857)
The maloperation of rbindlist()
regarding list
s of sf
-objects with
different CRS can't be handled after the fact. Thus, feeding such a list
to
rbindlist()
must be avoided. Hence prior checking if a list
of sf
contains
more than 1 CRS is crucial:
list(nc, nc_3857) %>% lapply(st_crs) %>% n_distinct()
geometry_type
sA list
with sf
-objects having geometry columns of different geometry_type
s:
l_diff_geom_type <- list(nc[c(17, 56), ], st_cast(nc[4, ], "POLYGON", warn = FALSE)) sapply(l_diff_geom_type, st_geometry_type, by_geometry = FALSE) %>% unique()
rbind()
can easily stack sf
-objects with geometry columns of different
geometry_type
s and the returned geometry_type
is GEOMETRY
:
sf_do.call <- do.call(rbind, l_diff_geom_type) st_geometry_type(sf_do.call, by_geometry = FALSE)
But rbindlist()
can't handle a list
containing different geometry_type
s:
rbindlist(l_diff_geom_type)
In order to make such a list
workable for rbindlist()
, the geometry_type
s
have to be homogenized:
l_homogenized_geom_type <- lapply(l_diff_geom_type, st_cast, to = "GEOMETRY", warn = FALSE) sapply(l_homogenized_geom_type, st_geometry_type, by_geometry = FALSE) %>% unique()
Once the list
ed sf
-objects have the same geometry_type
, rbindlist()
can
stack them:
sf_rbindlist <- rbindlist(l_homogenized_geom_type) %>% st_as_sf() st_geometry_type(sf_rbindlist, by_geometry = FALSE)
Note that the above demonstrated way of homogenizing
geometry_type
s isn't very time efficient for large list
s of sf
-objects.
Luckily there's a faster alternative:
l <- l_diff_geom_type # for the sake of code readability, copy list as short-named object for (i in seq_along(l)) { class(l[[i]][[attr(l[[i]], "sf_column")]])[[1]] <- "sfc_GEOMETRY" }
We check if the two homogenizing methods are equivalent:
all.equal(l_homogenized_geom_type, l)
all.equal( sf_rbindlist, st_as_sf(rbindlist(l)) )
sf_rbindlist <- sf_rbindlist[1:nrow(sf_rbindlist), ] class(sf_rbindlist) <- c("sf", "data.frame") row.names(sf_rbindlist) <- row.names(sf_do.call) all.equal(sf_rbindlist, sf_do.call)
The use of data.table::rbindlist()
as a fast alternative to do.call(rbind, )
for binding list
s of sf
-objects to a single sf
-object is justified in
situations when do.call(rbind, )
is too slow. It does, however, require
appropriate pre- and post-treatment:
list
of sf
-objects contains more than 1 distinct CRS:list_of_sf %>% lapply(st_crs) %>% n_distinct()
If so, CRS-unification needs to be done before moving on.
?st_transform ?st_crs
list
ed sf
-objects have different geometry_type
s:sapply(list_of_sf, st_geometry_type, by_geometry = FALSE) %>% unique()
If so, homogenize them:
list_of_sf <- lapply(list_of_sf, st_cast, to = "GEOMETRY", warn = FALSE) # or more advisable because faster (for large lists): for (i in seq_along(list_of_sf)) { class(list_of_sf[[i]][[attr(list_of_sf[[i]], "sf_column")]])[[1]] <- "sfc_GEOMETRY" }
list
of sf
-objects to a single sf
-object (which is also a
data.table
):list_of_sf %>% rbindlist() %>% st_as_sf() -> sf_object
bbox
:sf_object <- sf_object[1:nrow(sf_object), ] # or, preferable in a programming context: sf_object <- sf_object[seq_len(nrow(sf_object)), ]
data.table
:class(sf_object) <- c("sf", "data.frame")
Note that those five steps are integrated in the function
sfhelpers::st_rbindlist()
enabling a painless and fast conversion of a list
of sf
objects to a single sf
object. Moreover st_rbindlist()
has argument
options capable of handling list
s of sf
objects that differ by the names and
positions of their geometry and other attribute columns.
?sfhelpers::st_rbindlist
bind_rows_sf for combining lists of sf objects to single dataframe
rbindlist gets wrong bbox when binding a list of sf data frames
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.