The paper 'Wheeler, A. P., Steenbeek, W., & Andresen, M. A. (2018). Testing for similarity in area‐based spatial patterns: Alternative methods to Andresen's spatial point pattern test. Transactions in GIS, 22(3), 760-774' discusses several limitations of Andresen's original SPPT. In summary, these limitations are:
One must arbitrarily choose which point pattern is the base and test dataset, but results are not necessarily invariant to this choice. "For example, if the test set has a much larger sample size, it will result in smaller confidence intervals and, subsequently, more areas may be marked as statistically different. If you switched the analysis so that the smaller sample size was the test set, the intervals around the proportions would be much larger and, thus, the researcher may conclude the patterns are more similar." (Wheeler et al. 2018)
Areas with zero points in the test set will always have a confidence interval from 0% to 0%. The test will subsequently conclude patterns are different, even if in the base set the proportion is infinitesimally small (but non-zero).
The original SPPT only defines an area as "different" (no/yes) based on statistical significance, but there is no estimate of the size of the difference. However, differences may be statistically significant but substantively trivial.
library("sppt")
As a simple example of the consequences of these limitations, consider the toy data used in other vignettes (for illustrative purposes we first remove points that are outside the areal units):
plot(areas.sp) points1 <- points1.sp[areas.sp, ] points2 <- points2.sp[areas.sp, ] points(points1, col="blue", pch = 19) points(points2, col="red", pch = 15) text(coordinates(areas.sp), label = areas.sp$ID, font = 2)
Let's now run the R sppt
function with standard settings:
set.seed(39346) # set seed for reproducibility toy.sp <- sppt(points1, points2, areas.sp)
Let's now run the R sppt
function with standard settings, but with reversed Base and Test patterns:
set.seed(39346) # set seed for reproducibility toy2.sp <- sppt(points2, points1, areas.sp)
Show the output of both:
toy.sp@data toy2.sp@data
Although the order of variables are switched, it's easy to see all variables up to and including PctTstPts are similar. However, the confidence intervals are different, leading to different localS
and similarity
values, in turn producing a very different globalS
! This is definitely not the behavior we'd want from the test.
Realizing that SPPT is a procedure that aims to test the difference in proportions, Wheeler et al. (2018) propose to use a proportion difference test per areal unit.
In fact, the package sppt
>= 0.1.5 has two new functions:
sppt_boot()
, that calculate differences in proportions using a bootstrap resampling procedure for both point datasets.
sppt_diff()
, that implements the direct tests with a variety of methods. For both, a multiple comparisons correction is possible based on (e.g.) the false discovery rate (the default).
sppt_boot()
differs from the original sppt()
in that it does a resampling procedure on both Base and Test data. This function takes a random sample from Base and Test for each areal unit, calculates the proportion of sampled Base and Test points in each areal unit, and then the difference between the two percentages is calculated immediately (and this is done nsamples times for all areal units). If the 95% distribution of the differences between (resampled) Base and Test percentages excludes 0, they are deemed statistically significantly different from each other.
The standard settings of bootstrap = TRUE
make this a real bootstrap procedure: (a) for each sampling loop (nsamples of times) as many points are sampled as present in the data (so no longer a 85% selection!); (b) points are sampled WITH replacement. Using this bootstrapping method the researcher no longer needs to make a choice of (85%) percpoints
. If one does want to mimic the behavior of standard sppt()
but with the added improvement that the choice of Base and Test data does not affect results, set percpoints
to 85
and bootstrap
to FALSE
.
The bootstrapping procedure does NOT solve the issue of having 0 points in an areal unit. As points are not moved around in space in each resample, areal units without any points will always have 0 points in every sample. The next function, sppt_diff()
, does not suffer from this shortcoming.
As opposed to the traditional SPPT, this test calculates the difference in the proportions using either Chi-square proportions tests or Fisher's exact test. By default, the p-values are adjusted for multiple comparisons using the false discovery rate following Benjamini & Yekutieli (2001).
We first use the data already used in vignette sppt
:
if(!require(rgdal)) install.packages("rgdal", repos = "https://cloud.r-project.org/") burglary2003 <- rgdal::readOGR(dsn = "../inst/extdata", layer = "Vancouver_Residential_Burglary_2003") # The shapefiles are found in the "inst/extdata" folder within the package burglary2016 <- rgdal::readOGR(dsn = "../inst/extdata", layer = "Vancouver_Residential_Burglary_2016") das <- rgdal::readOGR(dsn = "../inst/extdata", layer = "Vancouver_DAs_2011_Census_UTMz10")
A map of the point patterns:
plot(das) points(burglary2003, col="blue", pch=16, cex=.3) points(burglary2016, col="red", pch=16, cex=.3)
Using the default values for sppt()
, sppt_boot()
, and sppt_diff
with Chi-Square N-1 correction without p-value adjustment for multiple comparisons:
set.seed(547347) myoutput_sppt <- sppt(burglary2003, burglary2016, das) set.seed(547347) myoutput_boot <- sppt_boot(burglary2003, burglary2016, das) set.seed(547347) myoutput_diff <- sppt_diff(burglary2003, burglary2016, das, adj="none")
The global S-values are:
mean(myoutput_sppt$globalS) mean(myoutput_boot$globalS) mean(myoutput_diff$globalS)
The differences between the original sppt()
and the new functions are massive!
It is difficult to interpret the differences reported above because the points datasets differ a lot in size; there are areal units with 0 points; and there are also many areal units with only a few points. For a better comparison, we hack some simulated data together:
if(!require(maptools)) install.packages("maptools", repos = "https://cloud.r-project.org/") if(!require(spatstat)) install.packages("spatstat", repos = "https://cloud.r-project.org/")
library("maptools", quietly = TRUE) library("spatstat", quietly = TRUE)
# simulate point patterns das.owin <- as.owin(das) west <- das.owin$xrange[1] east <- das.owin$xrange[2] # 100k points, so the next lines take a while set.seed(67337) pts1 <- spatstat::rpoint(100000, function(x, y, ...){ 100 * (east - x) }, win = das.owin) pts2 <- spatstat::rpoint(100000, function(x, y, ...){ 100 * (east - x) }, win = das.owin) # convert back to SpatialPoints pts1 <- as(pts1, "SpatialPoints") proj4string(pts1) <- proj4string(das) pts2 <- as(pts2, "SpatialPoints") proj4string(pts2) <- proj4string(das)
The resulting point patterns are both concentrated mostly in the western part of Vancouver.
if(!require(scales)) install.packages("scales", repos = "https://cloud.r-project.org/") plot(das) points(pts1, col=scales::alpha("blue", .3), pch=16, cex=.1) points(pts2, col=scales::alpha("red", .3), pch=16, cex=.1)
For the purposes of this test, we remove areal units that have fewer than 100 points in either Base or Test (and the points that afterwards no longer have an associated areal unit):
if(!require(rgeos)) install.packages("rgeos", repos = "https://cloud.r-project.org/") # Select the areal units containing 100 or more points (of both Base and Test) das2 <- das[as.numeric(which(colSums(rgeos::gContains(das, pts1, byid = TRUE)) >= 100)), ] das2 <- das2[as.numeric(which(colSums(rgeos::gContains(das2, pts2, byid = TRUE)) >= 100)), ] # Select only points within these areal units pts1.sp <- pts1[das2, ] pts2.sp <- pts2[das2, ]
# Plot plot(das2) points(pts1.sp, col=scales::alpha("blue", .3), pch=16, cex=.1) points(pts2.sp, col=scales::alpha("red", .3), pch=16, cex=.1)
Any difference between the two point patterns is due to random variation, and so we'd like to see the globalS values to be quite high, indicating stability. We use the default values for sppt(), sppt_boot(), and Chi-Square with Yates's correction with and without p-value adjustment for multiple comparisons, as well as Fisher's exact test with and without p-value adjustment for multiple comparisons:
set.seed(547347) myoutput_sppt <- sppt(pts1.sp, pts2.sp, das2) set.seed(547347) myoutput_boot <- sppt_boot(pts1.sp, pts2.sp, das2) set.seed(547347) myoutput_diff <- sppt_diff(pts1.sp, pts2.sp, das2, adj = "none") set.seed(547347) myoutput_diff_adj <- sppt_diff(pts1.sp, pts2.sp, das2) set.seed(547347) myoutput_fisher <- sppt_diff(pts1.sp, pts2.sp, das2, test = "Fisher", adj = "none") set.seed(547347) myoutput_fisher_adj <- sppt_diff(pts1.sp, pts2.sp, das2, test = "Fisher")
The global S-values are:
mean(myoutput_sppt$globalS) mean(myoutput_boot$globalS) mean(myoutput_diff$globalS) mean(myoutput_diff_adj$globalS) mean(myoutput_fisher$globalS) mean(myoutput_fisher_adj$globalS)
Some observations:
The differences between sppt()
and the two new functions are substantial. By only resampling the Test points set, sppt()
is much more likely to report the proportion of points as dissimilar. We conclude that outcomes of the original SPPT may have grossly overestimated the dissimilarity between point patterns.
Because with a large number of tests the probability of finding a false positive (i.e. an area being flagged as having a dissimilar proportion of points whereas they are actually similar) is increased by chance aone, by default sppt_diff()
adjusts for multiple comparisons. Because in that case fewer areal units may get -1 or +1 local S values, the (default) correction for multiple comparisons further increases the global S value.
By not having 'empty' areal units and having at least 100 points per unit, the sppt_boot()
and sppt_diff()
are quite similar. However, because of the advantages of sppt_diff()
when the number of points per areal unit is small or zero, our advice is to use this function in the future.
Agresti, A. (2003). Dealing with discreteness: Making ‘exact’ confidence intervals for proportions, differences of proportions, and odds ratios more exact. Statistical Methods in Medical Research, 12, 3-21.
Agresti, A., & Caffo, B. (2000). Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. The American Statistician, 54, 280-288.
Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29: 1165–1188.
Campbell, I. (2007). Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Statistics in Medicine, 26, 3661-3675.
Fisher, R.A. (1925). Statistical methods for research workers. Oliver and Boyd: Edinburgh
Wheeler, A. P., Steenbeek, W., & Andresen, M. A. (2018). Testing for similarity in area‐based spatial patterns: Alternative methods to Andresen's spatial point pattern test. Transactions in GIS, 22(3), 760-774.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.