In jeffreyhanson/geotrellis: geotrellis: spatial data analysis using geotrellis

devtools::load_all()

The geotrellis R package was an experiment to see if R could leverage the Scala data processing engine geotrellis to process spatial data faster than the raster R package. This vignette outlines the functions implemented in the geotrellis R package, and provides some benchmark results against the raster R package. This R package is experimental and not recommended for real analysis. Current benchmarks indicate that little to no benefits will be gained using this package.

Spatial data is stored in a gt_RasterLayer object. This R6 class contains all low-level the definitions for reading, writing, processing, and analyzing data. However, users should not interact with this class' methods directly. Instead, users should use the S4 methods provided. The usage of these S4 methods is outlined below.

Reading/writing data

Spatial data is read using the gt_raster method. This method can load data using a character file path for a GeoTiff file. Additionally, users can coerce a raster::RasterLayer-class object. Note that the package only support single-band layer raster data.

# load in data using file path
g <- gt_raster(system.file('extdata/test.tif', package='geotrellis'))

# load in data using a RasterLayer object
g <- gt_raster(raster(system.file('extdata/test.tif', package='geotrellis')))
````

Once data has been loaded into _R_, users can access the metadata associated with the data. The methods used to access the metadata follow the conventions used in the _raster R_ package.

```r
print(g)  # print summary of object
crs(g)    # coordinate reference system
extent(g) # spatial extent of data
res(g)    # resolution
ncell(g)  # number of data cells
ncol(g)   # number of columns
nrow(g)   # number of rows

Users can also convert the data into native R objects and raster::RasterLayer-class objects.

values(g)    # convert data as a numeric vector
as.matrix(g) # convert data to a matrix
as.raster(g) # convert data to a RasterLayer

Finally, data can be saved to disk using the gt_writeRaster function.

# save data to disk in a GeoTiff format
gt_writeRaster(g, tempfile(fileext='.tif'))

Geoprocessing

The geotrellis R package contains several geoprocessing methods.

To spatially align data in the same coordinate reference system, the gt_resample can be used:

# create new layer with different spatial resolution
l <- gt_raster(raster::disaggregate(as.raster(g), 2, method=''))

# resample data to new resolution
gt_resample(g, l, method='ngb')

To spatially project data to a different coordinate reference system, the gt_projectRaster method can be used. In addition to an input gt_RasterLayer object, this method accepts either: (1) a sp::CRS coordinate refence system object and a numeric resolution or (2) a gt_RasterLayer with spatial properties that are used to reproject the data.

gt_projectRaster(g, sp::CRS('+init=epsg:3395'), res=50000, method='ngb')

To extract contiguous chunk from a gt_RasterLayer object, the gt_crop method can be used. This method accepts either an raster::Extent-class object or a gt_RasterLayer object.

gt_crop(g, extent(raster::extent(c(0, 3, 2, 6))))

To set cells in in a gt_RasterLayer object to another value, the gt_mask method can be used. This defaults to updating cells in the argument to x to be NA where they are NA in the mask layer. The maskvalue and updatevalue arguments can used to modify this behavior though.

# create mask layer
m <- as.raster(g)
m <- raster::setValues(m, sample(c(1,NA), ncell(m), replace=TRUE))
m <- gt_raster(m)

# run mask analysis
gt_mask(g, m)

Statistics

This package offers two statistical methods for analyzing data in a gt_RasterLayer object. Note that there is currently a bug in the geotrellis code that causes medians to be calculated incorrectly. Additionally, note that unlike the functions in the raster R package, these functions calculate the population standard deviation, not the sample standard deviation.

The gt_cellStats method can be used to calculate statistics using all the cells in the raster data.

gt_cellStats(g)

The the gt_zonal method can be used to calculate statistics in zones, as specified in a second raster.

# make layer with zones
z <- as.raster(g)
z <- raster::setValues(z, sample(1:2, ncell(z), replace=TRUE))
z <- gt_raster(z)

# run zonal statistics
gt_zonal(g, z)

Benchmarks

Finally, the geotrellis R package provides a benchmark function to automate the benchmarking of all these functions and compare them to their counterparts in the raster package.

# run benchmark
b <- benchmark(ncell=c(1e+1, 1e+2, 1e+3, 1e+4, 1e+5), times=10L)
# print benchmark
print(b)
# plot benchmark
plot(b)

Acknowledgments

I would like to thank to the geotrellis developers for their assistance.

jeffreyhanson/geotrellis documentation built on May 19, 2019, 4 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

jeffreyhanson/geotrellis
geotrellis: spatial data analysis using geotrellis

In jeffreyhanson/geotrellis: geotrellis: spatial data analysis using geotrellis

Reading/writing data

Geoprocessing

Statistics

Benchmarks

Acknowledgments

R Package Documentation

Browse R Packages

We want your feedback!

jeffreyhanson/geotrellis geotrellis: spatial data analysis using geotrellis

In jeffreyhanson/geotrellis: geotrellis: spatial data analysis using geotrellis

Reading/writing data

Geoprocessing

Statistics

Benchmarks

Acknowledgments

R Package Documentation

Browse R Packages

We want your feedback!

jeffreyhanson/geotrellis
geotrellis: spatial data analysis using geotrellis