devtools::load_all()
The geotrellis R package was an experiment to see if R could leverage the Scala data processing engine geotrellis to process spatial data faster than the raster R package. This vignette outlines the functions implemented in the geotrellis R package, and provides some benchmark results against the raster R package. This R package is experimental and not recommended for real analysis. Current benchmarks indicate that little to no benefits will be gained using this package.
Spatial data is stored in a gt_RasterLayer
object. This R6 class contains all low-level the definitions for reading, writing, processing, and analyzing data. However, users should not interact with this class' methods directly. Instead, users should use the S4 methods provided. The usage of these S4 methods is outlined below.
Spatial data is read using the gt_raster
method. This method can load data using a character
file path for a GeoTiff file. Additionally, users can coerce a raster::RasterLayer-class
object. Note that the package only support single-band layer raster data.
# load in data using file path g <- gt_raster(system.file('extdata/test.tif', package='geotrellis')) # load in data using a RasterLayer object g <- gt_raster(raster(system.file('extdata/test.tif', package='geotrellis'))) ```` Once data has been loaded into _R_, users can access the metadata associated with the data. The methods used to access the metadata follow the conventions used in the _raster R_ package. ```r print(g) # print summary of object crs(g) # coordinate reference system extent(g) # spatial extent of data res(g) # resolution ncell(g) # number of data cells ncol(g) # number of columns nrow(g) # number of rows
Users can also convert the data into native R objects and raster::RasterLayer-class
objects.
values(g) # convert data as a numeric vector as.matrix(g) # convert data to a matrix as.raster(g) # convert data to a RasterLayer
Finally, data can be saved to disk using the gt_writeRaster
function.
# save data to disk in a GeoTiff format gt_writeRaster(g, tempfile(fileext='.tif'))
The geotrellis R package contains several geoprocessing methods.
gt_resample
can be used:# create new layer with different spatial resolution l <- gt_raster(raster::disaggregate(as.raster(g), 2, method='')) # resample data to new resolution gt_resample(g, l, method='ngb')
gt_projectRaster
method can be used. In addition to an input gt_RasterLayer
object, this method accepts either: (1) a sp::CRS
coordinate refence system object and a numeric
resolution or (2) a gt_RasterLayer
with spatial properties that are used to reproject the data.gt_projectRaster(g, sp::CRS('+init=epsg:3395'), res=50000, method='ngb')
gt_RasterLayer
object, the gt_crop
method can be used. This method accepts either an raster::Extent-class
object or a gt_RasterLayer
object.gt_crop(g, extent(raster::extent(c(0, 3, 2, 6))))
gt_RasterLayer
object to another value, the gt_mask
method can be used. This defaults to updating cells in the argument to x
to be NA
where they are NA
in the mask layer. The maskvalue
and updatevalue
arguments can used to modify this behavior though.# create mask layer m <- as.raster(g) m <- raster::setValues(m, sample(c(1,NA), ncell(m), replace=TRUE)) m <- gt_raster(m) # run mask analysis gt_mask(g, m)
This package offers two statistical methods for analyzing data in a gt_RasterLayer
object. Note that there is currently a bug in the geotrellis code that causes medians to be calculated incorrectly. Additionally, note that unlike the functions in the raster R package, these functions calculate the population standard deviation, not the sample standard deviation.
The gt_cellStats
method can be used to calculate statistics using all the cells in the raster data.
gt_cellStats(g)
The the gt_zonal
method can be used to calculate statistics in zones, as specified in a second raster.
# make layer with zones z <- as.raster(g) z <- raster::setValues(z, sample(1:2, ncell(z), replace=TRUE)) z <- gt_raster(z) # run zonal statistics gt_zonal(g, z)
Finally, the geotrellis R package provides a benchmark
function to automate the benchmarking of all these functions and compare them to their counterparts in the raster package.
# run benchmark b <- benchmark(ncell=c(1e+1, 1e+2, 1e+3, 1e+4, 1e+5), times=10L) # print benchmark print(b) # plot benchmark plot(b)
I would like to thank to the geotrellis developers for their assistance.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.