mss
PackageThis vignette introduces an R package for meaningful spatial statistics,
library(mss)
The package introduces three classes for spatial data: SField
for representing geostatistical data (spatial fields), SObjects
for point patterns (spatial objects), and SLattice
for lattice
(areal) data (spatial lattices). The abbreviations follow this
paper
(which is under review).
SExtent
: extent (window) of field or point patternSField
and SObjects
objects contain an SExtent
, representing
the domain or extent of the field, or the window of observation
for a point pattern.
The window class defines an area:
showClass("SExtent")
The area can be represented either by a grid, as in
library(sp) demo(meuse, ask = FALSE, echo = FALSE) w2 = SExtent(meuse.grid) # note the grid is passed, not the polygons
or by a polygon or set of polygons, as in
w1 = SExtent(meuse.area)
SField
: geostatistical dataSFields represent continuous functions, which over a domain $D$ have a value at every point $s \in D$.
showClass("SField")
Observations on fields variables are often done on irregular
point locations. Such a field variable is created from a
SpatialPointsDataFrame
object:
sf = SField(meuse, meuse.grid)
where, here, meuse.grid
defines the domain for the
field. Alternatively, point observations can be gridded, and
defined as a field by
sf = SField(meuse.grid, meuse.grid, cellsArePoints = TRUE)
Note that the first argument defines the point observations, and the second argument the domain: the first argument is treated as a (gridded) set of points (grid cell centres), the second as a gridded set of areas (cells).
We can alternatively assume that each grid cell represents an area with constant point values. Such a field is defined by:
sf = SField(meuse.grid, meuse.grid, cellsArePoints = FALSE)
This field is completely known, for every point location. Note
that grid cell values are constant throughout the cell, and are
not conceived as cell aggregate (e.g. average, or otherwise
convoluted) values. For that case, SpatialAggregates
are used.
Sets of points can not only be defined by grid cells, but also through lines, or polygons. For instance, points on a contour line could form a sample of a field:
library(maptools) library(rgeos) cl = ContourLines2SLDF(contourLines(as.image.SpatialGridDataFrame( meuse.grid["dist"]))) proj4string(cl) = CRS(proj4string(meuse.grid)) sf.lines = SField(cl, meuse.grid)
or, alternatively, as a set of polygons; in this example, a single polygon with a constant value:
pol = addAttrToGeom(meuse.area, data.frame(value = 3), FALSE) sf.pol = SField(pol, meuse.area)
SObjects
: point patterns or objectsThe SObjects
class can be used to represent objects in a space,
where the window of observation reflects the area for which all
points or objects are available:
showClass("SObjects")
We can create a point pattern e.g. for the sample locations of the meuse
data set:
pts = geometry(meuse) se.pts = SObjects(pts, meuse.area)
but also areas can indicate entities:
se.area = SObjects(meuse.area, meuse.area)
SLattice
: areal dataAggregations refer to values, measured or computed as the aggregate over a set of (point) values, e.g. the mean, maximum or minimum value. As such, aggregations need to be defined for sets of points, i.e. for lines, polygons, or grid cells:
showClass("SLattice")
Sets of points can be represented by anything but points: lines, grids, and polygons:
sa.1 = SLattice(cl) sa.3 = SLattice(meuse.grid) sa.2 = SLattice(pol)
Spatial grid data can be conceived as having values for 1. points, at grid cell centres 2. constant values throughout each grid cell 3. aggregated values over the grid cell region The difference is illustrated by querying each at two points, one coinciding with a grid cell centre, one not coinciding. The two points are indicated by red circles:
xy = rbind(coordinates(meuse)[1,], coordinates(meuse.grid)[10,]) pts = SpatialPoints(xy, CRS(proj4string(meuse.grid))) plot(as(meuse.grid[1:21,], "SpatialPolygons")) plot(pts, col = 'red', pch = 16, add = TRUE) plot(meuse.grid[1:21,], add = TRUE) text(coordinates(pts), c("1", "2"), pos = 2)
Querying a spatial field where cells are points gives non-missing values only when the point coincides with the grid cell centre:
f1 = SField(meuse.grid, meuse.area, cellsArePoints = TRUE) pts = SField(pts, meuse.grid) over(pts, f1)
Querying a spatial field where cells are constant valued areas gives the grid cell values of the cells where the point is in:
f2 = SField(meuse.grid, meuse.area, cellsArePoints = FALSE) over(pts, f2) # matches pts to the complete grid cell
Querying a grid where cell values are aggregations does not answer and generates a warning:
a = SLattice(meuse.grid) over(pts, a)
We can predict to points, laid out regularly, inside the polygon of the field domain:
m = SField(meuse, meuse.area) v = gstat::vgm(.6, "Sph", 900, .05) # variogram model i1 = interpolate(log(zinc)~1, m, model = v) spplot(i1, "var1.pred")
To show that these really concern points, plotting for a smaller region gives actual point symbols:
spplot(i1[1:300,], "var1.pred")
Alternatively, we can predict values on a pre-defined grid, taken from the domain:
m = SField(meuse, meuse.grid) i2 = interpolate(log(zinc)~1, m, model = v) print(as(i2[1:3,], "data.frame"), digits = 10) # point support
From point values, we can predict the polygon area mean value:
library(rgeos) m3 = SLattice(meuse.area) i3 = interpolate(log(zinc)~1, m, m3, model = v) # interpolate from point TO polygon support i3$var1.pred # a single value, estimate of mean for the whole area mean(log(m$zinc)) # sample mean mean(i1$var1.pred) # mean of predictions, similar to i2$var1.pred
or predict values for grid cell averages (block kriging):
m4 = SLattice(meuse.grid) i4 = interpolate(log(zinc)~1, m, m4, model = v) # interpolate from point TO feature (cell) support print(as(i4[1:3,], "data.frame"), digits = 10) # grid cells support spplot(i4, "var1.pred")
Suppose we have grid data on a 3 x 3 grid, and want to estimate on a new location, indicated by a red circle:
set.seed(131) gt = GridTopology(cellcentre.offset = c(0,0), cellsize = c(1,1), cells.dim = c(3,3)) sgdf = SpatialGridDataFrame(SpatialGrid(gt), data.frame(r = round(runif(9, max = 10), 1))) plot(as(sgdf, "SpatialPolygons"), axes = TRUE) text(coordinates(sgdf), labels = sgdf[[1]]) pt.red = SpatialPoints(cbind(0.25, 0.25)) points(pt.red, col = 'red', pch = 1)
The first case is when the grid cell data represent
point support values at the grid cell centre, which
is identical to case i1
above:
sf = SField(as(sgdf, "SpatialPointsDataFrame"), sgdf) vm = gstat::vgm(1, "Exp", 1) i5 = interpolate(r~1, sf, SField(pt.red, sgdf), model = vm) as(i5[1,], "data.frame")
The second case, grid cells represent constant area with point support value, meaning that every point location within a grid cell has the value of the grid cell assigned:
sf = SField(sgdf, sgdf, cellsArePoints = FALSE) i6 = interpolate(r~1, sf, SField(pt.red, sgdf), model = vm) as(i6[1,], "data.frame") # does not interpolate, but queries grid cell
as case i4
above, from point values we can predict a mean value for a grid cell by
sf = SField(sgdf, sgdf, cellsArePoints = TRUE) gt = GridTopology(cellcentre.offset = c(.25,.25), cellsize = c(1,1), cells.dim = c(1,1)) grd.red = SpatialGrid(gt) i7 = interpolate(r~1, sf, SLattice(grd.red), model = vm) i7@observations@data
sp = as(sgdf, "SpatialPolygonsDataFrame") gf = SLattice(sp) kr = interpolate(r~1, gf, SField(pt.red, sp), model = vm) kr@observations[[1]] library(gstat) krige0(r ~ 1, sp, pt.red, vgmArea, vgm = vm) # check
sp = as(sgdf, "SpatialPolygonsDataFrame") gr = as(grd.red, "SpatialPolygons") grd = addAttrToGeom(gr, data.frame(r=0), FALSE) plot(as(sgdf, "SpatialPolygons"), axes = TRUE) text(coordinates(sgdf), labels = sgdf[[1]]) plot(grd, add = TRUE, border = 'red') kr = interpolate(r~1, gf, SLattice(grd), model = vm) kr@observations[[1]] krige0(r ~ 1, sp, grd, vgmArea, vgm = vm) # check
For SpatialEntity
objects, we can estimate the density of objects; this
method reuses MASS:kde2d
:
e = SObjects(meuse, meuse.area) d = density(e, newdata = meuse.grid) lt = list(list("sp.polygons", meuse.area, border = 'black', first=FALSE), list("sp.points", meuse, col = grey(.5), pch = 16)) class(d) spplot(d, sp.layout = lt)
The reason why this returns a SLattice
is that the
values computed (densities) do not correspond to quantities that are
measurable at point locations where they are registered (grid cells).
In fact, they are aggregated values for areas even much larger than
the grid cells.
SField
datam = SField(meuse["zinc"], meuse.area) a = aggregate(m, SLattice(meuse.area), mean) # no warning a[[1]] a = aggregate(m, SLattice(meuse.area), sum) # warns: a[[1]]
SpatialEntity
datam = SObjects(meuse["zinc"], meuse.area) a = aggregate(m, SLattice(meuse.area), mean) # warns: a[[1]] a = aggregate(m, SLattice(meuse.area), sum) # no warning a[[1]]
Pebesma et al. (2014) make assertions A1-A7.
A prediction is meaningful, if it provides an estimate for a potential observation Prediction on a field variable generates no warning.
Where prediction of a SField
varialbe generates no warning,
m = SField(meuse, meuse.area) # data are point support res = interpolate(log(zinc)~1, m) # interpolate from point TO point support
prediction of a point pattern variable generates a warning:
m = SObjects(meuse, meuse.area) sa = SLattice(meuse.grid) res = interpolate(log(zinc)~1, m, sa) # interpolate point pattern variable: warns
Summing up values over an area is meaningful, if the observed window corresponds to the target geometry (grouping predicate) of the aggregation.
For SpatialEntity
data, spatial aggregation over an area larger than the
observation window generates a warning:
x = bbox(meuse)[1,] y = bbox(meuse)[2,] bb = Polygon(cbind(c(x[1],x[1],x[2],x[2],x[1]), c(y[1],y[2],y[2],y[1],y[1]))) bbx = SpatialPolygons(list(Polygons(list(bb),"bbox_meuse")), proj4string = CRS(proj4string(meuse))) m = SObjects(meuse["zinc"], meuse.area) a = aggregate(m, SLattice(bbx), sum) # warns:
Stasch et al. 2014 prove A1 and A2 more formally.
Further assertions in the poster:
Polygon data may reflect constant values over areas (geology) or means (population density), this matters for (further) aggregation or downsampling.
Package mss
distinguishes between SField
and
SLattice
to represent constant values, and aggregations,
respectively. Subsampling SField
variabls does not not warn,
area = addAttrToGeom(meuse.area, data.frame(v=1), match.ID = FALSE) f = SField(area, meuse.area) pts = SField(meuse[1:5,], meuse.area) over(pts, f) # retrieve f at the points of pts:
but subsampling SLattice
variables triggers a warning, and no
useful value
a = SLattice(area) over(pts, a)
as it should really disaggregate first, from area to point.
The same is true for grid cells.
SField
and SLattice
both generalize polygon and grid data.
f1 = SField(meuse.grid, meuse.area, cellsArePoints = TRUE) pts = SField(meuse[1:5,], meuse.area) over(pts, f1) # matches pts to grid cell center points: f2 = SField(meuse.grid, meuse.area, cellsArePoints = FALSE) over(pts, f2) # matches pts to the complete grid cell a = SLattice(meuse.grid) over(pts, a)
Point pattern data often come without observation window, over which aggregation is meaningful
Attempting to create a SObjects
object without an observation window specified gives
a warning message:
pts <- SObjects(meuse)
Providing a window that does not contain all the points gives an error:
tr = try(pts <- SObjects(meuse, meuse[1:10,])) attr(tr, "condition")
Field measurements often come without a notion for which locations interpolations based on them make sense.
m = SField(meuse["zinc"], meuse.area) a = interpolate(log(zinc)~1, m, SLattice(bbx)) # warns:
File types do not inform on meaningfulness
SpatialPoints
, SpatialPixels
, SpatialGrid
, SpatialLines
,
SpatialPolygons
and their *DataFrame
counterparts do not inform
whether data concern fields (geostatistical data), entities (point
patterns), or aggregations. The classes in mss
: SField
,
SObjects
, and SLattice
, do.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.