Dean Koch 2022-06-03
Mitacs UYRW project
pkern: Fast kriging on gridded datasets
This vignette shows how to use pkern
to interpolate the Meuse river
dataset (part of the sp
package) onto a high resolution grid:
pkern_grid
defines the gridpkern_snap
snaps point data to the gridpkern_fit
fits a covariance model to the datapkern_cmean
computes the kriging predictor and varianceAll plots in this document except the first are generated by
pkern_plot
, which uses base graphics to display rasters and matrices.
Try setting ij=TRUE
in any of the plot calls to see a different
styling used for matrices.
pkern
has no requirements outside of base packages (stats
, utils
,
grDevices
, and graphics
), but terra
and sf
are (strongly)
suggested, and we use them in this vignette to load and manage the
example data.
# load extra dependencies for the vignette
library(sp)
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
library(terra)
## terra version 1.4.22
We will look at the “meuse” dataset (Pebesma, 2009) on heavy metal
concentrations in the Meuse river floodplain in the Netherlands. These
data are included with the sp
package that we just loaded, and appear
in this tutorial
pdf
for gstat. Load the data with data(meuse)
and data(meuse.riv)
. I
have a helper function get_meuse
to do this (see source code for this
document).
# load data
meuse = get_meuse()
str(meuse)
## List of 3
## $ soils :Classes 'sf' and 'data.frame': 155 obs. of 15 variables:
## ..$ cadmium : num [1:155] 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
## ..$ copper : num [1:155] 85 81 68 81 48 61 31 29 37 24 ...
## ..$ lead : num [1:155] 299 277 199 116 117 137 132 150 133 80 ...
## ..$ zinc : num [1:155] 1022 1141 640 257 269 ...
## ..$ elev : num [1:155] 7.91 6.98 7.8 7.66 7.48 ...
## ..$ dist : num [1:155] 0.00136 0.01222 0.10303 0.19009 0.27709 ...
## ..$ om : num [1:155] 13.6 14 13 8 8.7 7.8 9.2 9.5 10.6 6.3 ...
## ..$ ffreq : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
## ..$ soil : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 2 1 1 2 ...
## ..$ lime : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
## ..$ landuse : Factor w/ 15 levels "Aa","Ab","Ag",..: 4 4 4 11 4 11 4 2 2 15 ...
## ..$ dist.m : num [1:155] 50 30 150 270 380 470 240 120 240 420 ...
## ..$ geometry:sfc_POINT of length 155; first list element: 'XY' num [1:2] 181072 333611
## ..$ distance: num [1:155, 1] 118 125 232 359 482 ...
## ..$ log_zinc: num [1:155] 6.93 7.04 6.46 5.55 5.59 ...
## ..- attr(*, "sf_column")= chr "geometry"
## ..- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## .. ..- attr(*, "names")= chr [1:14] "cadmium" "copper" "lead" "zinc" ...
## $ river_poly:sfc_POLYGON of length 1; first list element: List of 1
## ..$ : num [1:413, 1:2] 181505 181478 181452 181420 181388 ...
## ..- attr(*, "class")= chr [1:3] "XY" "POLYGON" "sfg"
## $ river_line:sfc_LINESTRING of length 1; first list element: 'XY' num [1:409, 1:2] 181444 181425 181412 181398 181379 ...
# plot using sf package
plot(meuse[['river_poly']], col='lightblue', reset=FALSE)
plot(meuse[['river_line']], lwd=2, add=TRUE)
plot(meuse[['soils']]['zinc'], pch=16, add=TRUE)
n_meuse = nrow(meuse[['soils']])
We will interpolate the log-transformed values (as in the gstat
vignette). Start by defining a grid and snapping the points to it
# desired resolution in units of metres
gres = c(y=5, x=5)
# snap points, copying values of dependent variable
g_meuse = pkern_snap(meuse[['soils']]['log_zinc'], g=list(gres=gres))
# plot the grid only
pkern_plot(modifyList(g_meuse, list(gval=NA)), col_grid=NA)
# plot with source points indicated over their snapped grid location
pkern_plot(g_meuse, zlab='log(zinc)', reset=FALSE)
plot(sf::st_geometry(meuse[['soils']]), add=TRUE)
Pass this data to pkern_fit to do ordinary kriging
# ordinary kriging: fit isotropic gaussian model by default
fit_result_OK = pkern_fit(g_obs=g_meuse, pars='gau', quiet=TRUE)
By default, a sample semi-variogram is plotted, with the fitted model drawn in transparent blue. Notice that a range of semi-variances can be realized for a given distance because the model is anisotropic.
The fitted model overestimates the (partial sill) variance in this case because there is an unaccounted-for trend. Ordinary kriging estimates spatially constant trends, but here we should expect it to depend on how near we are to the presumed source of the zinc deposits (the Meuse River).
The code below demonstrates creating a linear predictor - the distance to the river.
# useful variables
gdim = g_meuse[['gdim']]
is_obs = !is.na(g_meuse[['gval']])
n_obs = sum(is_obs)
# get distance values for entire grid
g_meuse_sf = pkern_coords(g_meuse, out='sf')
## processing 435240 grid points...
d2r_result = units::drop_units(st_distance(g_meuse_sf, meuse[['river_line']]))
# include both distance and its square root
meuse_predictors = scale(cbind(d2r_result, sqrt(d2r_result)))
# copy the subset of predictors at observed response locations
X = matrix(meuse_predictors[is_obs,], nrow=n_obs)
Fit the model with the linear predictor included. This is called universal kriging
# fit the model and plot semivariogram
fit_result_UK = pkern_fit(g_obs=g_meuse, pars='gau', X=X, quiet=TRUE)
# print the GLS estimates for coefficients of the linear predictor
pars_UK = fit_result_UK$pars
pkern_GLS(g_meuse, pars_UK, X=X, out='b')
## [1] 5.681606 1.238569 -1.884427
These fitted kernels can be visualized using pkern_plot_pars
. This
plots the footprint of correlations around any given central point on
the grid.
# plot the kernel shape
pkern_plot_pars(pars_UK, g_meuse)
# print the actual contents of the parameters list
str(pars_UK)
## List of 4
## $ y :List of 2
## ..$ k : chr "gau"
## ..$ kp: Named num 233
## .. ..- attr(*, "names")= chr "rho"
## $ x :List of 2
## ..$ k : chr "gau"
## ..$ kp: Named num 233
## .. ..- attr(*, "names")= chr "rho"
## $ eps : num 0.0856
## $ psill: num 0.0937
# GLS to estimate the (spatially varying) trend
z_gls = pkern_GLS(g_meuse, pars_UK, X=meuse_predictors, out='z')
g_meuse_gls = modifyList(g_meuse, list(gval=z_gls))
pkern_plot(g_meuse_gls, main='estimated trend component')
Once you have a set of covariance parameters, interpolation becomes very
easy with pkern_cmean
. This computes the kriging predictor, an
interpolator with nice properties like unbiasedness and minimal
variance, under suitable assumptions.
# compute spatial mean
g_meuse_detrend = modifyList(g_meuse, list(gval=g_meuse[['gval']]-z_gls))
z_spat = pkern_cmean(g_meuse_detrend, pars_UK, X=0)
g_meuse_spat = modifyList(g_meuse, list(gval=z_spat))
pkern_plot(g_meuse_spat,
main='estimated spatial component')
# compute UK predictions, masking outside observed range
z_pred = z_gls + z_spat
g_meuse_pred = modifyList(g_meuse, list(gval=z_pred))
pkern_plot(g_meuse_pred,
zlim=log(range(meuse[['soils']][['zinc']])),
zlab='log(zinc)',
main='kriging predictor')
Kriging variance is also computed with pkern_cmean
. This is much
slower to compute than the kriging predictor
# prediction variance
z_var = pkern_cmean(g_meuse_detrend, pars_UK, X=0, out='v', quiet=TRUE)
g_meuse_var = modifyList(g_meuse, list(gval=z_var))
pkern_plot(g_meuse_var, main='kriging variance')
The above plots are for the variable on the log-scale. The unbiasdness property is lost when transforming back to the original scale, but now that we have the kriging variance, we can make a correction:
# prediction bias adjustment from log scale
z_pred2 = exp(z_pred + z_var/2)
g_meuse_pred2 = modifyList(g_meuse, list(gval=z_pred2))
pkern_plot(g_meuse_pred2,
zlab='zinc (ppm)',
zlim=range(meuse[['soils']][['zinc']]),
main='UK predictions (masked to input range)')
This vignette is intended to get users up and running with pkern
by
demonstrating a very simple example on a familiar dataset. In other
vignettes we will look in more detail at how pkern_cmean
actually
works, and how users can modify the workflow to incorporate a trend
model.
This chunk below is used to create the markdown document you’re reading
from the R script file with same name, in this directory. It uses
rmarkdown
to create the md file, then substitutes local image paths
for github URLs so the document will display the images properly on my
repo page.
if(FALSE)
{
# Restart session and run code chunk below to build the markdown file
library(here)
library(rmarkdown)
# make the markdown document and delete the unwanted html
path.input = here('vignettes/meuse_vignette.R')
path.output = here('vignettes/meuse_vignette.md')
path.garbage = here('vignettes/meuse_vignette.html')
rmarkdown::render(path.input, clean=TRUE, output_file=path.output)
unlink(path.garbage)
# substitute local file paths for image files with URLs on github
md.github = gsub('https://github.com/deankoch/pkern/blob/main', 'https://github.com/deankoch/pkern/blob/main', readLines(path.output))
writeLines(md.github, path.output)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.