Create an 'aurelhy' object that contains required data to perform AURELHY interpolation

Share:

Description

An 'aurelhy' object contains principal components calculated after the various variables describing the landscape, as well as other useful descriptors. Use the predict() method to interpolate some data with the AURELHY method.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
aurelhy(geotm, geomask, landmask = auremask(), x0 = 30, y0 = 30, step = 12,
    nbr.pc = 10, scale = FALSE, model = "data ~ .", vgmodel = vgm(1, "Sph", 10, 1),
    add.vars = NULL, var.name = NULL, resample.geomask = TRUE)

## S3 method for class 'aurelhy'
print(x, ...)
## S3 method for class 'aurelhy'
plot(x, y, main = "PCA on land descriptors", ...)
## S3 method for class 'aurelhy'
points(x, pch = ".", ...)
## S3 method for class 'aurelhy'
summary(object, ...)
## S3 method for class 'aurelhy'
update(object, nbr.pc, scale, model, vgmodel, ...)
## S3 method for class 'aurelhy'
predict(object, geopoints, variable, v.fit = NULL, ...)

## S3 method for class 'predict.aurelhy'
print(x, ...)
## S3 method for class 'predict.aurelhy'
summary(object, ...)
## S3 method for class 'predict.aurelhy'
plot(x, y, which = 1, ...)

## S3 method for class 'aurelhy'
as.geomat(x, what = "PC1", nodata = NA, ...) 
## S3 method for class 'predict.aurelhy'
as.geomat(x,
    what = c("Interpolated", "Predicted", "KrigedResiduals", "KrigeVariance"),
	nodata = NA,...)

Arguments

geotm

a terrain model ('geotm' object) with enough resolution to be able to calculate all landscape descriptors (use print() or plot() methods of an 'auremask' object used to calculate landscape descriptors to check your terrain model is dense enough).

geomask

a 'geomask' object with same resolution and coverage of the 'geotm' object or the final interpolation grid (with resample.geomask = FALSE), and indicating which points should be considered for the interpolation (note that your terrain model must be larger than the targetted area by, at least, maximum distance of the mask in all directions in order to be able to calculate landscape descriptors for all considered points).

landmask

an 'auremask' object that defines the window of analysis used around each point ot calculate its landscape descriptors.

x0

shift in X direction (longitude) where to consider the first point of the interpolation grid (note that interpolation grid must be less dense or equal to the terrain model grid, depending on the mask used).

y0

shift in Y direction (latitude) for the first point of the interpolation grid.

step

resolution of the interpolation grid, i.e., we keep one point every step points from the original grid of the terrain model for constructing the interpolation grid. step must be a single integer larger or equal to one (may be equal to one only with rectangular 'auremask' objects).

nbr.pc

number of PCA's principal components to keep in the interpolation. This is the initial value; the example show you how you can change this after the 'aurelhy' object is calculated.

scale

should we scale the landscape descriptors (variance = 1) before performing the PCA? If scale = FALSE (by default), a PCA is run on the variance-covariance matrix (no scaling), otherwise, the PCA is run on the correlation matrix.

model

a formula describing the model used to predict the data. The left-hand side of the formula must always be 'data' and the right-hand considers all predictors spearated by a plus sign. To use all predictors, specify data ~ . (by default).

vgmodel

the variogram model to fit, as defined by the vgm() function of the gstat package

add.vars

additional variable(s) measured at the same points as the geotm object, or the final interpolation grid. They will be used as additional predictors. The example show you how you can add or remove such variables after the 'aurelhy' object is calculated. If NULL (by default), no additional variables will be used.

var.name

if add.vars is a 'geomat' object, you can give the name you want to use for this predictor here.

resample.geomask

do we resample the geomask using x0, y0 and step to get the final mask of calculated points? If TRUE (by default), geomask should have the same grid as geotm. Otherwise, the geomask must exactly match the points where aurelhy should perform the interpolation. The default value allows for a backward-compatible behaviour of the function (aurelhy version =< 1.0-2).

x

an 'aurelhy' or 'predict.aurelhy' object, depending on the method invoked

y

a 'geopoints' object to create a plot best depicting the interpolation process, or nothing to just plot the interpolation grid.

main

the main title of the graph

pch

the symbol to use for plotting points. The default value, pch = "." prints a small (usually one pixel size) square

object

an 'aurelhy' object

geopoints

a 'geopoints' object with data to be interpolated.

variable

the name of the variable in the 'geopoints' object to interpolate

v.fit

the fitted variogram model used to krige residuals. If NULL (by default), a fitted model for the variogram is calculated, starting from the model provided in the 'aurelhy' object, 'vgm' slot. If FALSE, residuals are not kriged (useful, e.g., to save calculation time when one look for best predictors in the regression)

which

which graph to plot

what

what is extracted as a 'geomat' object

nodata

the code used to represent missing data in the 'geomat' object

...

further arguments passed to the function

Details

aurelhy() creates a new 'aurelhy' object. The object has print() and plot() methods for further diagnostics. You should use the predict() method to perform the AURELHY interpolation on some data. The 'aurelhy' object is also easy to save for further reuse (it is designed so that the most time-consumming operations are done during its creation; so, it is supposed to be generated only once and reused for different interpolations on the same terrain model).

Value

An 'aurelhy' object with all information required to perform an AURELHY interpolation with any 'geopoints' data.

Author(s)

Philippe Grosjean <phgrosjean@sciviews.org>

Source

Benichou P, Le Breton O (1987). Prise en compte de la topographie pour la cartographie des champs pluviometriques statistiques. La Meteorologie, 7:23-34.

See Also

geotm, auremask

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
# Create an aurelhy object for the Morocco terrain data
data(morocco)  # The terrain model with a grid of about 924x924m
data(mbord)    # A shape with the area around Morocco to analyze
data(mmask)    # A 924x924m grid with a mask covering territory to analyze
data(mseadist) # The distance to the sea for territory to analyze
data(mrain)    # Rain data measured at 43 stations to be interpolated

# Create a map with these data
image(morocco) # Plot the terrain model
grid()
lines(mbord, col = "red") # Add borders of territory to analyze in red

# Make sure we use all the stations from mrain in the geomask
mmask2 <- add.points(mmask, mrain)

# Now, create an aurelhy object with landscape description, using the
# first ten PCs, plus the distance to the sea (mseadist) for prediction
# Use a default radial window of analysis of 26km as maximum distance
# and an interpolation grid of 0.1x0.1degrees (roughly 11x11km)
# The variogram model is kept simple here, see ?vgm for other choices
# Be patient... this takes a little time to calculate!
maurelhy <- aurelhy(morocco, mmask2, auremask(), x0 = 30, y0 = 54, step = 12,
    scale = TRUE, nbr.pc = 10, vgmodel = vgm(100, "Sph", 1, 1),
    add.vars = mseadist, var.name = "seadist")    
maurelhy
points(maurelhy) # Add the interpolated points on the map
points(mrain, col = "red") # Add location of weather stations in red

# Diagnostic of the PCA on land descriptors
summary(maurelhy)
plot(maurelhy)

# Interpolate 'rain' variable on the considered territory around Morocco
# Since we do not want negative values for this variable and it is log-normally
# distributed, we will interpolate log10(rain) instead
mrain$logRain <- log10(mrain$rain)
pmrain <- predict(maurelhy, mrain, "logRain")
pmrain

# Diagnostic of regression model
summary(pmrain) # Significant predictors at alpha = 0.01 are x, y, PC3, PC6 and PC7
# one could simplify the model as data ~ x + y + PC3 + PC6 + PC7
# but it is faster to keep the full model for final interpolation
# when we are only interested by the final interpolation or when processing
# is automated...
# Any of the predictors can be extracted from maurelhy as a geomat object
# for further inspection. For instance, let's look at PC3, PC6 and PC7 components
persp(as.geomat(maurelhy, "PC3"), expand = 50)
persp(as.geomat(maurelhy, "PC6"), expand = 50)
persp(as.geomat(maurelhy, "PC7"), expand = 50)

plot(pmrain, which = 1) # Residuals versus fitted (how residuals spread?)
plot(pmrain, which = 2) # Normal Q-Q plot of residuals (residuals distribution)
plot(pmrain, which = 3) # Best graph to look at residuals homoscedasticity
plot(pmrain, which = 4) # Cook's distance of residuals versus observation
plot(pmrain, which = 5) # Residuals leverage: are there influencial points?
# Map of predicted values
filled.contour(as.geomat(pmrain, "Predicted"), asp = 1,
    color.palette = terrain.colors, main = "Values predicted by the linear model")

# Residuals kriging diagnostic
plot(pmrain, which = 6) # Semi-variogram and adjusted model
filled.contour(as.geomat(pmrain, "KrigedResiduals"), asp = 1,
    color.palette = terrain.colors, main = "Kriged residuals")
filled.contour(as.geomat(pmrain, "KrigeVariance"), asp = 1,
    color.palette = terrain.colors, main = "Kriged residuals variance")
# As we can expect, kriging variance is larger in the south/south-west part
# where density of stations is low

# AURELHY interpolation diagnostic plots
# Graph showing the importance of predicted versus kriged residuals for
# all observations
plot(pmrain, which = 7)  # Model prediction and kriged residuals

# Extract interpolated log(rain) and transform back into rain (mm)
geomrain <- as.geomat(pmrain)
geomrain <- 10^geomrain

# How is interpolated rain distributed?
range(geomrain, na.rm = TRUE)
range(mrain$rain)

# Ooops! We have some very high values! How many?
sum(geomrain > 1000, na.rm = TRUE)
# This is probably due to a lack of data at high altitudes
# Let's truncate them to 1000 for a better graph
geomrain[geomrain > 1000] <- 1000
# ... and plot the result
image(geomrain, col = topo.colors(12))
contour(geomrain, add = TRUE)
lines(mbord, col = "red")
points(mrain, col = "red")

# A better plot for these interpolated rain data
filled.contour(coords(geomrain, "x"), coords(geomrain, "y"), geomrain,
    asp = 1, xlab = "Longitude", ylab = "Latitude",
    main = "AURELHY interpolated rain data", key.title = title(sub = "Rain (mm)\n\n"), 
    color.palette = colorRampPalette(c("red", "gray", "blue"), bias = 2))


# One can experiment different interpolation parameters using update()
# Suppose we (1) don't want to scale PCs, (2) to keep only first 7 PCs,
# (3) we want an upgraded linear model like this:
# data <- a.x + b.y + c.z + d.PC3 + e.PC6 + f.PC7 + g.seadist + h.seadist^2 + i
# and (4) we want a Gaussian model for the semi-variogram
# (note that one can also regress against seadist2 <- seadist^2), just do:
# maurelhy2$seadist2 <- maurelhy$seadist^2
# Even with all these changes, you don't have to recompute maurelhy,
# just update() it and the costly steps of calculating landscape descriptors
# are reused (not the use of I() to protect calcs inside a formula)!
maurelhy2 <- update(maurelhy, scale = FALSE, nbr.pc = 7,
  model = data ~ x + y + z + PC3 + PC6 + PC7 + seadist + I(seadist^2),
  vgmodel = vgm(100, "Gau", 1, 1))
maurelhy2

# Diagnostic of the new PCA on land descriptors without scaling
summary(maurelhy2)
plot(maurelhy2)

# Interpolate with the new parameters
pmrain2 <- predict(maurelhy2, mrain, "logRain")
summary(pmrain2)

# A couple of graphs
plot(pmrain2, which = 1) # Residuals versus fitted (how residuals spread?)
plot(pmrain, which = 6) # Semi-variogram and adjusted model
plot(pmrain2, which = 7) # Model prediction and kriged residuals

#... Explore as much as you like until you find the set of parameters that suits you!