knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
if (!require("remotes")) install.packages("remotes") library("remotes") remotes::install_github("ethanplunkett/rasterPrep")
The rasterPrep package fills two distinct needs:
To create fully aligned raster files for modeling, and analysis, from raster or vector data.
To prepares output from models for viewing in GIS software by humans.
This section walks through the process of preparing raster data for modeling.
The first step in many raster based GIS analyses is creating a series of raster files that coincide perfectly. That is they have identical resolutions, cell alignments, coordinate reference systems (crs), and extents.
Typically, the source data is a mix of raster and vector files that do not coincide. This workflow demonstrates how convert disparate source data into a set of coinciding TIFF files.
Overview:
makeRefence()
warpToReference()
to make existing rasters coincide with the referencerasterizeToReference()
to convert vector data to rasters that coincide
with the reference, or overlay vector data on raster data.Load packages
library(rasterPrep) library(gdalUtilities) # for ogr2ogr() used to reproject vector Shapefiles library(terra) # for crs() and rast()
Input paths
Four example files are included with the rasterPrep package:
input$bound
, "Amherst.shp": The polygon boundary for the Town of Amherst, MA extracted from here:
https://docs.digital.mass.gov/dataset/massgis-data-community-boundaries-towns
It is in Mass Mainland mass state plain projection (EPSG:26986).
This defines our study area and output projection.
input$slope
, "slope.tif": A raster clipped from https://scholarworks.umass.edu/data/17/ with
the original projection and cell alignment preserved. It is in the USA Contiguous Albers Equal Area Conic Projection(ESRI:202003).
input$roads
, "roads.shp": A Shapefile containing vector roads, originally from Open Street map but with small roads removed and clipped to Amherst. It's in the same projection as 2.
input$key
, "roadClassKey.csv": A csv file with columns "value" , "category", and "color", representing integer values, associated road class names, and a hexadecimal color that should be used to draw it.
input <- list( bound = system.file("extdata", "Amherst.shp", package = "rasterPrep"), slope = system.file("extdata", "slope.tif", package = "rasterPrep"), roads = system.file("extdata", "roads.shp", package = "rasterPrep"), key = system.file("extdata", "roadClassKey.csv", package = "rasterPrep") )
Output paths
out$ref
, "reference.tif" A reference file used to define the extent,
CRS, cellsize, and cell alignment, this is created first and then all other
files are aligned to it.
out$slopeTemp
, slopeTemp.tif : An intermediate slope grid.
out$slope
, "slope.tif" : The slope grid, ready for modeling.
out$road$lines
, roadLines.shp : Intermediate, transformed vector roads.
out$roadsTemp
, "roads1.tif" : Intermediate road raster.
out$roads
, "roads.tif" : Binary Roads; 1 for road, 0 not road, or NA
outside of study area.
roadClass
, "roadClass.tif" : Classified road raster.
outDir <- file.path(tempdir(), "rasterPrepDemo") dir.create(outDir, showWarnings = FALSE) out <- list(ref = file.path(outDir, "reference.tif"), # reference slopeTemp = file.path(outDir, "slopeTemp.tif"), # Intermediate slope slope = file.path(outDir, "slope.tif"), # Slope roadLines = file.path(outDir, "roadLines.shp"), # Transformed roads roadsTemp = file.path(outDir, "roads1.tif"), # Intermediate roads roads = file.path(outDir, "roads.tif"), # Binary Roads roadClass = file.path(outDir, "roadClass.tif")) # Road class
# Extra cleanup for vignette, to cleanup old output when rerunning. # This deletes the entire contents of outDir! Do not add to your workflow. unlink(outDir, recursive = TRUE) dir.create(outDir, showWarnings = FALSE)
You may already have a raster that you would like all your other data to coincide with. If so use that as your reference.
If you do not have a reference raster then we can create one from a polygon that delineates a buffered study area boundary that has the desired CRS. The buffer can be added to include surrounding cells that influence your study area.
makeReference(polyFile = input$bound, destination = out$ref, cellsize = 30)
The optional alignTo
argument determines the cell alignment (AKA snapping),
there are two options to choose from:
The default ("alignto = "origin"
) means that the edges of the cells will all
fall on integer multiples of the cell size.
If you have an important raster data set in your desired CRS and resolution
you may set your reference cell alignment to match it so that there is
no resampling or transformation required for that data set. This is especially
helpful for categorical data (like landcover) which is particularly hard to
resample without loss of information.
In this case you would set alignTo = "reference"
and then set the reference
argument to the path of a raster with the desired cell alignment. In
makeReference
unlike other functions in rasterPrep the reference will only
be used for cell alignment.
With either alignment scheme the resulting raster is going to have an
extent just big enough to include all the pixels that overlap the
polyFile
argument.
The line of code below aligns the input slope data with our reference raster, it transforms, crops, extends, and resamples as necessary to produce a raster that matches the reference resolution and extent.
Slope is continuous so the "bilinear"
resampling method is appropriate.
Use the default nearest neighbor method (method = "near"
) for
categorical data, like landcover. If your input raster has much smaller cells
than the reference method = "average"
is a good choice.
warpToReference(input$slope, out$slopeTemp, reference = out$ref, method = "bilinear")
The next step is optional, it assigns NA to areas outside of the clip polygon. It is a separate step to avoid small errors in the clipping that occur when transforming or resampling in the same step as clipping.
warpToReference(out$slopeTemp, out$slope, reference = out$ref, clip = input$bound)
Delete the intermediate file.
# Delete intermediate file deleteTif(out$slopeTemp)
Visualize the result:
slope <- rast(out$slope) plot(slope)
Transform vector inputs to match the reference. This code uses the 'ogr2ogr()' function within the gdalUtilities R package. You could also use the sf package to read in the vector data, reproject, and write out to a new Shapefile.
refProj <- terra::crs(terra::rast(out$ref)) # well known text of projection projFile <- file.path(outDir, "refproj.txt") cat(refProj, file = projFile) # well known text in file gdalUtilities::ogr2ogr(input$roads, out$roadLines, t_srs = projFile)
Road class
Create a raster that contains the values from the "ROADCLASS"" attribute from the roads shapefile in cells underlying the roads and and NA everywhere else.
rasterizeToReference(out$roadLines, destination = out$roadClass, reference = out$ref, attribute = "ROADCLASS", type = "byte")
Roads
Create a road raster with NA outside the study area boundary, and 0 or 1 inside the boundary to indicate if there is a road.
Burn zero everywhere within the study area:
rasterizeToReference(input$bound, out$roadsTemp, reference = out$ref, burn = 0)
Burn in ones everywhere under roads. Setting allTouched=TRUE
means that all cells touched by the road will have 1; this creates orthogonal
connections along the road which can be useful in landscape analysis.
rasterizeToReference(out$roadLines, destination = out$roadsTemp, allTouched = TRUE, burn = 1)
Clip to eliminate the roads outside the study area.
warpToReference(out$roadsTemp, out$roads, reference = out$ref, clip = input$bound)
Delete intermediate file and visualize.
deleteTif(out$roadsTemp) roads <- rast(out$roads) plot(roads)
geoTIFFs that are tiled, have overviews (AKA pyramids), stats, histograms, and possibly value attribute tables (VATs) will load faster in some GIS software than files that don't; compression reduces file size.
Most of those things either hinder reading and processing with the terra package or add extra files that don't help. Do not use these functions on model input, instead use them on model output in preparation for viewing, or distribution.
Create a "final" directory for these files.
finalDir <- file.path(outDir, "final") dir.create(finalDir, showWarnings = FALSE) final <- list() final$slope <- file.path(finalDir, "slope.tif") final$roads <- file.path(finalDir, "roads.tif") final$roadclass <- file.path(finalDir, "roadclass.tif")
For many data sets a single call to makeNiceTiff()
will create a copy of
the input raster that is optimized for viewing.
The code below prepares the slope grid produced above for viewing.
Slope is continuous so we set overviewResample="average"
so that each
overview cell will be assigned the average value of the smaller cells it
covers. This is different than the default of "nearest" used by many
GIS systems.
makeNiceTif(out$slope, final$slope, overviewResample = "average")
A value attribute table (VAT) is a table of all the unique values in a raster file, the count of cells of each, and any additional tabular data that is associated with each value. It's stored in a "sidecar" file which has the full raster name and extension plus ".vat.dbf" so for "slope.tif" the vat is "slope.tif.vat.dbf".
There are two reasons you might want to add a VAT:
Adding a vat tells GIS software that the data is categorical and it will default to a categorical rather than gradient display when loaded.
Do not add a VAT to continuous data (like slope).
Create a VAT while preparing a file for export by setting vat=TRUE
:
makeNiceTif(source = out$roads, destination = final$roads, overwrite = TRUE, overviewResample = "near", stats = FALSE, vat = TRUE)
Another reason to create a VAT is to associate additional information with each cell value - like the name of the landcover class, or road class.
We'll see an example below of this at the end of the next section.
Color tables allow associating a specific color with every class, and as such are only suitable to categorical data and make a lot of sense for landcover where we might use greens for natural uplands, blues for water, grays and blacks for roads, etc.
There are a couple of limitations to the color tables supported here.
Despite these limitations color tables allow you to programmatically set the color and labels for categories in a TIFF file from within R, and as such are really useful to process output from models that produce landcover TIFFs.
The key needs to be a data frame with three named columns:
value
: the integer value used to represent the category in the TIFF.category
: (optional) the category name associated with each valuecolor
: the hexadecimal color to displaykey <- read.csv(input$key, stringsAsFactors = FALSE) print(key)
Use addColorTable()
to create a small .vrt
(gdal virtual format)
file that references the input TIFF and adds color table information.
vrt.file <- addColorTable(out$roadClass, table = key) # returns path to file
The .vrt
file can be opened directly with ArcGIS or QGIS but it doesn't
contain the raster data.
Create a geoTIFF that has the color table fully embedded in it while adding
compression, tiling, stats, histograms, and overviews by calling
makeNiceTiff()
on the .vrt
file. The default overviewResample
of
"near"
is appropriate for categorical data.
makeNiceTif(source = vrt.file, destination = final$roadclass, overwrite = TRUE, stats = TRUE)
The color table itself embeds class names but primarily for symbology.
The addVat()
function can be used to associate multiple attribute columns
with each value in the grid as well as generate a count of how many times
each value occurs.
addVat(x = final$roadclass, attributes = key)
The result is a raster that will load quickly, display specific colors and labels for each class, and that contains class attribute data.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.