sfarrow is a package for reading and writing Parquet and Feather files with sf objects using arrow in R.
Simple features are a popular format for representing spatial vector data using data.frames and a list-like geometry column, implemented in the R package sf. Apache Parquet is an open-source, column-oriented data storage format (https://parquet.apache.org/) that enables efficient reading and writing of large files. Parquet files are becoming popular across programming languages and can be used in R with the arrow package.
The sfarrow implementation translates simple feature data objects using the well-known binary (WKB) format for geometries and reads/writes Parquet/Feather files. A key goal of the package is interoperability of the files (particularly with Python GeoPandas), so coordinate reference system information is maintained in a standard metadata format (https://github.com/geopandas/geo-arrow-spec).
Note to users: this metadata format is not yet stable for production use and may change in the future.
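The WKB translation described above can be illustrated with sf alone. This is a minimal sketch, not the package's internal code: it assumes sf is installed, and the two point coordinates are made-up values for illustration.

```r
library(sf)

# A small geometry column with a coordinate reference system (EPSG:4326).
# The coordinates are arbitrary illustration values.
geom <- st_sfc(st_point(c(-78.6, 35.8)), st_point(c(-80.8, 35.2)), crs = 4326)

# Encode each geometry as WKB: a list of raw vectors, one per feature.
wkb <- st_as_binary(geom)

# Plain WKB does not carry the CRS, so it is supplied again when decoding.
# This is why the CRS has to travel separately, in the file metadata.
geom_back <- st_as_sfc(wkb, crs = st_crs(geom))
geom_back
```

Because the binary geometry has no CRS of its own, the round trip only preserves it when the CRS is stored alongside the data.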
sfarrow is available through CRAN with:
install.packages('sfarrow')
or it can be installed from GitHub with:
devtools::install_github("wcjochem/sfarrow@main")
Load the library to begin using it.
library(sfarrow)
arrow package
The installation requires the Arrow library, which should be installed with the R package arrow dependency. However, some systems may need to follow additional steps to enable full support of that library. Please refer to the arrow documentation.
Reading Parquet data of spatial files created with Python GeoPandas.
# load Natural Earth low-res dataset.
# Created in Python with geopandas.to_parquet()
path <- system.file("extdata", "world.parquet", package = "sfarrow")
world <- st_read_parquet(path)
world
plot(sf::st_geometry(world))
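To see the spatial metadata that makes this file readable across languages, the same file can be opened with arrow directly. The following is a sketch under assumptions: it relies on the metadata being stored as a JSON string under a "geo" key with a primary_column field, as the metadata specification linked above describes, and it assumes jsonlite is installed for parsing.

```r
library(arrow)
library(jsonlite)

path <- system.file("extdata", "world.parquet", package = "sfarrow")

# Read as an Arrow Table (not a data.frame) so file-level metadata is kept.
tbl <- arrow::read_parquet(path, as_data_frame = FALSE)

# Assumption: the spatial metadata is a JSON string under the "geo" key.
geo <- jsonlite::fromJSON(tbl$metadata$geo)

names(geo)           # top-level metadata fields
geo$primary_column   # name of the main geometry column
```

Since the metadata format may change, the exact fields returned here can differ between versions of the tools that wrote the file.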
Writing sf objects to Parquet format files. These Parquet files created with sfarrow can be read within Python using GeoPandas.
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet=TRUE)
st_write_parquet(obj=nc, dsn=file.path(tempdir(), "nc.parquet"))
# read back into R
nc_p <- st_read_parquet(file.path(tempdir(), "nc.parquet"))
nc_p
plot(sf::st_geometry(nc_p))
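The package also handles Feather files, which the examples above do not cover. A minimal sketch of the equivalent round trip, using the package's st_write_feather() and st_read_feather() functions; the output file name nc.feather is an arbitrary choice for illustration.

```r
library(sfarrow)

# Same North Carolina dataset shipped with sf.
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Write to a Feather file in a temporary directory (file name is arbitrary).
feather_path <- file.path(tempdir(), "nc.feather")
st_write_feather(obj = nc, dsn = feather_path)

# Read back into R as an sf object.
nc_f <- st_read_feather(feather_path)
nc_f
```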
For additional examples please see the vignettes.
Contributions, questions, ideas, and issue reports are welcome. Please raise an issue to discuss or submit a pull request.
This work benefited from the work by developers in the GeoPandas, Arrow, and r-spatial teams. Thank you to the teams for their excellent, open-source work.