knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Installation

if (!require("remotes"))
  install.packages("remotes")
library("remotes")
remotes::install_github("ethanplunkett/rasterPrep")

The rasterPrep package fills two distinct needs:

  1. To create fully aligned raster files for modeling, and analysis, from raster or vector data.

  2. To prepares output from models for viewing in GIS software by humans.

I. Preparing rasters for modeling

This section walks through the process of preparing raster data for modeling.

The first step in many raster based GIS analyses is creating a series of raster files that coincide perfectly. That is they have identical resolutions, cell alignments, coordinate reference systems (crs), and extents.

Typically, the source data is a mix of raster and vector files that do not coincide. This workflow demonstrates how convert disparate source data into a set of coinciding TIFF files.

Overview:

  1. Load packages and setup paths
  2. Create a reference raster file with makeRefence()
  3. Use warpToReference() to make existing rasters coincide with the reference
  4. Use rasterizeToReference() to convert vector data to rasters that coincide with the reference, or overlay vector data on raster data.

1. Load packages and setup paths

Load packages

library(rasterPrep)
library(gdalUtilities)  # for ogr2ogr() used to reproject vector Shapefiles
library(terra)  # for crs() and rast()

Input paths

Four example files are included with the rasterPrep package:

  1. input$bound, "Amherst.shp": The polygon boundary for the Town of Amherst, MA extracted from here: https://docs.digital.mass.gov/dataset/massgis-data-community-boundaries-towns It is in Mass Mainland mass state plain projection (EPSG:26986). This defines our study area and output projection.

  2. input$slope, "slope.tif": A raster clipped from https://scholarworks.umass.edu/data/17/ with the original projection and cell alignment preserved. It is in the USA Contiguous Albers Equal Area Conic Projection(ESRI:202003).

  3. input$roads, "roads.shp": A Shapefile containing vector roads, originally from Open Street map but with small roads removed and clipped to Amherst. It's in the same projection as 2.

  4. input$key, "roadClassKey.csv": A csv file with columns "value" , "category", and "color", representing integer values, associated road class names, and a hexadecimal color that should be used to draw it.

input <- list(
  bound = system.file("extdata", "Amherst.shp", package = "rasterPrep"),
  slope = system.file("extdata", "slope.tif", package = "rasterPrep"),
  roads =  system.file("extdata", "roads.shp", package = "rasterPrep"),
  key = system.file("extdata", "roadClassKey.csv", package = "rasterPrep")
)

Output paths

  1. out$ref, "reference.tif" A reference file used to define the extent, CRS, cellsize, and cell alignment, this is created first and then all other files are aligned to it.

  2. out$slopeTemp, slopeTemp.tif : An intermediate slope grid.

  3. out$slope, "slope.tif" : The slope grid, ready for modeling.

  4. out$road$lines, roadLines.shp : Intermediate, transformed vector roads.

  5. out$roadsTemp, "roads1.tif" : Intermediate road raster.

  6. out$roads, "roads.tif" : Binary Roads; 1 for road, 0 not road, or NA outside of study area.

  7. roadClass, "roadClass.tif" : Classified road raster.

outDir <- file.path(tempdir(), "rasterPrepDemo")
dir.create(outDir, showWarnings = FALSE)

out <-
  list(ref = file.path(outDir, "reference.tif"),        # reference
       slopeTemp = file.path(outDir, "slopeTemp.tif"),  # Intermediate slope
       slope = file.path(outDir, "slope.tif"),          # Slope
       roadLines = file.path(outDir, "roadLines.shp"),  # Transformed roads
       roadsTemp = file.path(outDir, "roads1.tif"),     # Intermediate roads
       roads  = file.path(outDir, "roads.tif"),         # Binary Roads
       roadClass  = file.path(outDir, "roadClass.tif")) # Road class
# Extra cleanup for vignette, to cleanup old output when rerunning.
# This deletes the entire contents of outDir!  Do not add to your workflow.
unlink(outDir, recursive = TRUE)
dir.create(outDir, showWarnings = FALSE)

2. Create a reference raster

You may already have a raster that you would like all your other data to coincide with. If so use that as your reference.

If you do not have a reference raster then we can create one from a polygon that delineates a buffered study area boundary that has the desired CRS. The buffer can be added to include surrounding cells that influence your study area.

makeReference(polyFile = input$bound, destination = out$ref, cellsize = 30)

The optional alignTo argument determines the cell alignment (AKA snapping), there are two options to choose from:

  1. The default ("alignto = "origin") means that the edges of the cells will all fall on integer multiples of the cell size.

  2. If you have an important raster data set in your desired CRS and resolution you may set your reference cell alignment to match it so that there is no resampling or transformation required for that data set. This is especially helpful for categorical data (like landcover) which is particularly hard to resample without loss of information. In this case you would set alignTo = "reference" and then set the reference argument to the path of a raster with the desired cell alignment. In makeReference unlike other functions in rasterPrep the reference will only be used for cell alignment.

With either alignment scheme the resulting raster is going to have an extent just big enough to include all the pixels that overlap the polyFile argument.

3. Transform and resample raster data to match the reference

The line of code below aligns the input slope data with our reference raster, it transforms, crops, extends, and resamples as necessary to produce a raster that matches the reference resolution and extent.

Slope is continuous so the "bilinear" resampling method is appropriate. Use the default nearest neighbor method (method = "near") for categorical data, like landcover. If your input raster has much smaller cells than the reference method = "average" is a good choice.

warpToReference(input$slope, out$slopeTemp, reference = out$ref,
                method = "bilinear")

The next step is optional, it assigns NA to areas outside of the clip polygon. It is a separate step to avoid small errors in the clipping that occur when transforming or resampling in the same step as clipping.

warpToReference(out$slopeTemp, out$slope, reference =  out$ref,
                clip = input$bound)

Delete the intermediate file.

# Delete intermediate file
deleteTif(out$slopeTemp)

Visualize the result:

slope <- rast(out$slope)
plot(slope)

4. Rasterize vector to match the reference

Transform vector inputs to match the reference. This code uses the 'ogr2ogr()' function within the gdalUtilities R package. You could also use the sf package to read in the vector data, reproject, and write out to a new Shapefile.

refProj <- terra::crs(terra::rast(out$ref))  # well known text of projection
projFile <- file.path(outDir, "refproj.txt")
cat(refProj, file = projFile)   # well known text in file
gdalUtilities::ogr2ogr(input$roads, out$roadLines, t_srs = projFile)

Road class

Create a raster that contains the values from the "ROADCLASS"" attribute from the roads shapefile in cells underlying the roads and and NA everywhere else.

rasterizeToReference(out$roadLines, destination = out$roadClass,
                     reference = out$ref, attribute = "ROADCLASS",
                     type = "byte")

Roads

Create a road raster with NA outside the study area boundary, and 0 or 1 inside the boundary to indicate if there is a road.

Burn zero everywhere within the study area:

rasterizeToReference(input$bound, out$roadsTemp, reference = out$ref,
                     burn = 0)

Burn in ones everywhere under roads. Setting allTouched=TRUE means that all cells touched by the road will have 1; this creates orthogonal connections along the road which can be useful in landscape analysis.

rasterizeToReference(out$roadLines, destination = out$roadsTemp,
                     allTouched = TRUE, burn = 1)

Clip to eliminate the roads outside the study area.

warpToReference(out$roadsTemp, out$roads, reference = out$ref,
                clip = input$bound)

Delete intermediate file and visualize.

deleteTif(out$roadsTemp)
roads <- rast(out$roads)
plot(roads)

II. Preparing raster files for viewing with GIS software

geoTIFFs that are tiled, have overviews (AKA pyramids), stats, histograms, and possibly value attribute tables (VATs) will load faster in some GIS software than files that don't; compression reduces file size.

Most of those things either hinder reading and processing with the terra package or add extra files that don't help. Do not use these functions on model input, instead use them on model output in preparation for viewing, or distribution.

Setup final paths

Create a "final" directory for these files.

finalDir <- file.path(outDir, "final")
dir.create(finalDir, showWarnings = FALSE)
final <- list()
final$slope <- file.path(finalDir,  "slope.tif")
final$roads <- file.path(finalDir,  "roads.tif")
final$roadclass <- file.path(finalDir, "roadclass.tif")

For many data sets a single call to makeNiceTiff() will create a copy of the input raster that is optimized for viewing.

The code below prepares the slope grid produced above for viewing. Slope is continuous so we set overviewResample="average" so that each overview cell will be assigned the average value of the smaller cells it covers. This is different than the default of "nearest" used by many GIS systems.

makeNiceTif(out$slope, final$slope, overviewResample = "average")

Adding a Value Attribute Table (VAT)

A value attribute table (VAT) is a table of all the unique values in a raster file, the count of cells of each, and any additional tabular data that is associated with each value. It's stored in a "sidecar" file which has the full raster name and extension plus ".vat.dbf" so for "slope.tif" the vat is "slope.tif.vat.dbf".

There are two reasons you might want to add a VAT:

  1. Adding a vat tells GIS software that the data is categorical and it will default to a categorical rather than gradient display when loaded.

    Do not add a VAT to continuous data (like slope).

    Create a VAT while preparing a file for export by setting vat=TRUE:

makeNiceTif(source = out$roads, destination = final$roads,
            overwrite = TRUE, overviewResample = "near", stats = FALSE,
            vat = TRUE)
  1. Another reason to create a VAT is to associate additional information with each cell value - like the name of the landcover class, or road class.

    We'll see an example below of this at the end of the next section.

Color Tables

Color tables allow associating a specific color with every class, and as such are only suitable to categorical data and make a lot of sense for landcover where we might use greens for natural uplands, blues for water, grays and blacks for roads, etc.

There are a couple of limitations to the color tables supported here.

  1. The TIFF must be byte encoded.
  2. Every value between 0 and the maximum value is going to be in the table so ideally used values are sequential.

Despite these limitations color tables allow you to programmatically set the color and labels for categories in a TIFF file from within R, and as such are really useful to process output from models that produce landcover TIFFs.

The key needs to be a data frame with three named columns:

key <- read.csv(input$key, stringsAsFactors = FALSE)
print(key)

Use addColorTable() to create a small .vrt (gdal virtual format) file that references the input TIFF and adds color table information.

vrt.file <- addColorTable(out$roadClass, table = key) # returns path to file

The .vrt file can be opened directly with ArcGIS or QGIS but it doesn't contain the raster data.

Create a geoTIFF that has the color table fully embedded in it while adding compression, tiling, stats, histograms, and overviews by calling makeNiceTiff() on the .vrt file. The default overviewResampleof "near" is appropriate for categorical data.

makeNiceTif(source = vrt.file, destination = final$roadclass,
            overwrite = TRUE, stats = TRUE)

Add attributes

The color table itself embeds class names but primarily for symbology. The addVat() function can be used to associate multiple attribute columns with each value in the grid as well as generate a count of how many times each value occurs.

addVat(x = final$roadclass, attributes = key)

The result is a raster that will load quickly, display specific colors and labels for each class, and that contains class attribute data.



ethanplunkett/rasterPrep documentation built on Sept. 17, 2024, 1:05 p.m.