Geographically weighted regression

Share:

Description

The function implements the basic geographically weighted regression approach to exploring spatial non-stationarity for given global bandwidth and chosen weighting scheme.

Usage

1
2
3
4
5
6
gwr(formula, data=list(), coords, bandwidth, gweight=gwr.Gauss, 
	adapt=NULL, hatmatrix = FALSE, fit.points, longlat=NULL, 
	se.fit=FALSE, weights, cl=NULL, predictions = FALSE, 
        fittedGWRobject = NULL, se.fit.CCT = TRUE)
## S3 method for class 'gwr'
print(x, ...)

Arguments

formula

regression model formula as in lm

data

model data frame, or SpatialPointsDataFrame or SpatialPolygonsDataFrame as defined in package sp

coords

matrix of coordinates of points representing the spatial positions of the observations; may be omitted if the object passed through the data argument is from package sp

bandwidth

bandwidth used in the weighting function, possibly calculated by gwr.sel

gweight

geographical weighting function, at present gwr.Gauss() default, or gwr.gauss(), the previous default or gwr.bisquare()

adapt

either NULL (default) or a proportion between 0 and 1 of observations to include in weighting scheme (k-nearest neighbours)

hatmatrix

if TRUE, return the hatmatrix as a component of the result, ignored if fit.points given

fit.points

an object containing the coordinates of fit points; often an object from package sp; if missing, the coordinates given through the data argument object, or the coords argument are used

longlat

TRUE if point coordinates are longitude-latitude decimal degrees, in which case distances are measured in kilometers; if x is a SpatialPoints object, the value is taken from the object itself

se.fit

if TRUE, return local coefficient standard errors - if hatmatrix is TRUE and no fit.points are given, two effective degrees of freedom sigmas will be used to generate alternative coefficient standard errors

weights

case weights used as in weighted least squares, beware of scaling issues, probably unsafe

cl

if NULL, ignored, otherwise cl must be an object describing a “cluster” created using makeCluster in the parallel package. The cluster will then be used to hand off the calculation of local coefficients to cluster nodes, if fit points have been given as an argument, and hatmatrix=FALSE

predictions

default FALSE; if TRUE and no fit points given, return GW fitted values at data points, if fit points given and are a Spatial*DataFrame object containing the RHS variables in the formula, return GW predictions at the fit points

fittedGWRobject

a fitted gwr object with a hatmatrix (optional), if given, and if fit.points are given and if se.fit is TRUE, two effective degrees of freedom sigmas will be used to generate alternative coefficient standard errors

se.fit.CCT

default TRUE, compute local coefficient standard errors using formula (2.14), p. 55, in the GWR book

x

an object of class "gwr" returned by the gwr function

...

arguments to be passed to other functions

Details

The function applies the weighting function in turn to each of the observations, or fit points if given, calculating a weighted regression for each point. The results may be explored to see if coefficient values vary over space. The local coefficient estimates may be made on a multi-node cluster using the cl argument to pass through a parallel cluster. The function will then divide the fit points (which must be given separately) between the clusters for fitting. Note that each node will need to have the “spgwr” package present, so initiating by clusterEvalQ(cl, library(spgwr)) may save a little time per node. The function clears the global environment on the node of objects sent. Using two nodes reduces timings to a little over half the time for a single node.

The section of the examples code now includes two simulation scenarios, showing how important it is to check that mapped pattern in local coefficients is actually there, rather than being an artefact.

Value

A list of class “gwr”:

SDF

a SpatialPointsDataFrame (may be gridded) or SpatialPolygonsDataFrame object (see package "sp") with fit.points, weights, GWR coefficient estimates, R-squared, and coefficient standard errors in its "data" slot.

lhat

Leung et al. L matrix

lm

Ordinary least squares global regression on the same model formula, as returned by lm.wfit().

bandwidth

the bandwidth used.

this.call

the function call used.

Author(s)

Roger Bivand Roger.Bivand@nhh.no

References

Fotheringham, A.S., Brunsdon, C., and Charlton, M.E., 2002, Geographically Weighted Regression, Chichester: Wiley; Paez A, Farber S, Wheeler D, 2011, "A simulation-based study of geographically weighted regression as a method for investigating spatially varying relationships", Environment and Planning A 43(12) 2992-3010; http://gwr.nuim.ie/

See Also

gwr.sel, gwr.gauss, gwr.bisquare

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
data(columbus)
col.lm <- lm(crime ~ income + housing, data=columbus)
summary(col.lm)
col.bw <- gwr.sel(crime ~ income + housing, data=columbus,
  coords=cbind(columbus$x, columbus$y))
col.gauss <- gwr(crime ~ income + housing, data=columbus,
  coords=cbind(columbus$x, columbus$y), bandwidth=col.bw, hatmatrix=TRUE)
col.gauss
col.d <- gwr.sel(crime ~ income + housing, data=columbus,
  coords=cbind(columbus$x, columbus$y), gweight=gwr.bisquare)
col.bisq <- gwr(crime ~ income + housing, data=columbus,
  coords=cbind(columbus$x, columbus$y), bandwidth=col.d, 
  gweight=gwr.bisquare, hatmatrix=TRUE)
col.bisq
data(georgia)
g.adapt.gauss <- gwr.sel(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + 
 PctPov + PctBlack, data=gSRDF, adapt=TRUE)
res.adpt <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov + 
 PctBlack, data=gSRDF, adapt=g.adapt.gauss)
res.adpt
pairs(as(res.adpt$SDF, "data.frame")[,2:8], pch=".")
brks <- c(-0.25, 0, 0.01, 0.025, 0.075)
cols <- grey(5:2/6)
plot(res.adpt$SDF, col=cols[findInterval(res.adpt$SDF$PctBlack, brks,
 all.inside=TRUE)])

# simulation scenario with patterned dependent variable
set.seed(1)
X0 <- runif(nrow(gSRDF)*3)
X1 <- matrix(sample(X0), ncol=3)
X1 <- prcomp(X1, center=FALSE, scale.=FALSE)$x
gSRDF$X1 <- X1[,1]
gSRDF$X2 <- X1[,2]
gSRDF$X3 <- X1[,3]
bw <- gwr.sel(PctBach ~ X1 + X2 + X3, data=gSRDF, verbose=FALSE)
out <- gwr(PctBach ~ X1 + X2 + X3, data=gSRDF, bandwidth=bw, hatmatrix=TRUE)
out
spplot(gSRDF, "PctBach", col.regions=grey.colors(20))
spplot(gSRDF, c("X1", "X2", "X3"), col.regions=grey.colors(20))
# pattern in the local coefficients
spplot(out$SDF, c("X1", "X2", "X3"), col.regions=grey.colors(20))
# but no "significant" pattern
spplot(out$SDF, c("X1_se", "X2_se", "X3_se"), col.regions=grey.colors(20))
out$SDF$X1_t <- out$SDF$X1/out$SDF$X1_se
out$SDF$X2_t <- out$SDF$X2/out$SDF$X2_se
out$SDF$X3_t <- out$SDF$X3/out$SDF$X3_se
spplot(out$SDF, c("X1_t", "X2_t", "X3_t"), col.regions=grey.colors(20))
# simulation scenario with random dependent variable
yrn <- rnorm(nrow(gSRDF))
gSRDF$yrn <- sample(yrn)
bw <- gwr.sel(yrn ~ X1 + X2 + X3, data=gSRDF, verbose=FALSE)
# bandwidth selection maxes out at 620 km, equal to upper bound
# of line search
out <- gwr(yrn ~ X1 + X2 + X3, data=gSRDF, bandwidth=bw, hatmatrix=TRUE)
out
spplot(gSRDF, "yrn", col.regions=grey.colors(20))
spplot(gSRDF, c("X1", "X2", "X3"), col.regions=grey.colors(20))
# pattern in the local coefficients
spplot(out$SDF, c("X1", "X2", "X3"), col.regions=grey.colors(20))
# but no "significant" pattern
spplot(out$SDF, c("X1_se", "X2_se", "X3_se"), col.regions=grey.colors(20))
out$SDF$X1_t <- out$SDF$X1/out$SDF$X1_se
out$SDF$X2_t <- out$SDF$X2/out$SDF$X2_se
out$SDF$X3_t <- out$SDF$X3/out$SDF$X3_se
spplot(out$SDF, c("X1_t", "X2_t", "X3_t"), col.regions=grey.colors(20))
# end of simulations


data(meuse)
coordinates(meuse) <- c("x", "y")
meuse$ffreq <- factor(meuse$ffreq)
data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
meuse.grid$ffreq <- factor(meuse.grid$ffreq)
gridded(meuse.grid) <- TRUE
xx <- gwr(cadmium ~ dist, meuse, bandwidth = 228, hatmatrix=TRUE)
xx
x <- gwr(cadmium ~ dist, meuse, bandwidth = 228, fit.points = meuse.grid,
 predict=TRUE, se.fit=TRUE, fittedGWRobject=xx)
x
spplot(x$SDF, "pred")
spplot(x$SDF, "pred.se")

## Not run: 
  g.bw.gauss <- gwr.sel(PctBach ~ TotPop90 + PctRural + PctEld + PctFB +
    PctPov + PctBlack, data=gSRDF)
  res.bw <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov +
    PctBlack, data=gSRDF, bandwidth=g.bw.gauss)
  res.bw
  pairs(as(res.bw$SDF, "data.frame")[,2:8], pch=".")
  plot(res.bw$SDF, col=cols[findInterval(res.bw$SDF$PctBlack, brks,
    all.inside=TRUE)])
  g.bw.gauss <- gwr.sel(PctBach ~ TotPop90 + PctRural + PctEld + PctFB +
    PctPov + PctBlack, data=gSRDF, longlat=TRUE)
  data(gSRouter)
  require(maptools)
  SG <- GE_SpatialGrid(gSRouter, maxPixels = 100)
  SPxMASK0 <- over(SG$SG, gSRouter)
  SGDF <- SpatialGridDataFrame(slot(SG$SG, "grid"),
    data=data.frame(SPxMASK0=SPxMASK0),
    proj4string=CRS(proj4string(gSRouter)))
  SPxDF <- as(SGDF, "SpatialPixelsDataFrame")
  res.bw <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov +
    PctBlack, data=gSRDF, bandwidth=g.bw.gauss, fit.points=SPxDF,
    longlat=TRUE)
  res.bw
  res.bw$timings
  spplot(res.bw$SDF, "PctBlack")
  require(parallel)
  cl <- makeCluster(detectCores())
  res.bwc <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov +
    PctBlack, data=gSRDF, bandwidth=g.bw.gauss, fit.points=SPxDF,
    longlat=TRUE, cl=cl)
  res.bwc
  res.bwc$timings
  stopCluster(cl)

## End(Not run)