knitr::opts_chunk$set(echo = TRUE)
library(spdep)
library(spatialreg)
library(sf)
library(ggplot2)
This guide shows the functionality of the spqdep package for testing spatial dependence in qualitative data sets.
Two data sets will be used as examples in this guide:
provinces_spain: The division of Spain into provinces. It is a multipolygon geometry with isolated provinces (islands without neighbouring provinces). See, for example, Paez et al. (2021).
FastFood.sf: The data set used as an example in @ruiz2010. It is a point geometry.
The package is installed in the usual way, and the data sets can be loaded with the following code:
library(spqdep)
data("provinces_spain", package = "spqdep")
data("FastFood.sf", package = "spqdep")
In addition to the two data sets available in the spqdep package, the user can generate structured spatial processes with the \code{dgp.spq()} function. The data generating process (DGP) implemented in this function is defined in @ruiz2010.
The following code shows how to generate a random process on a set of random points located in a 1x1 square. In this case, the connectivity criterion is based on the 4 nearest neighbours.
set.seed(123)
N <- 100
cx <- runif(N)
cy <- runif(N)
coor <- cbind(cx,cy)
p <- c(1/6,3/6,2/6)
rho = 0.5
listw <- spdep::nb2listw(knn2nb(knearneigh(coor, k = 4)))
fx <- dgp.spq(listw = listw, p = p, rho = rho)
The next plot shows the resulting qualitative spatial process.
ggplot(data.frame(fx = fx, cx = cx, cy = cy), aes(x = cx, y = cy, color = fx)) +
  geom_point(size = 6) +
  theme_bw()
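To make the connectivity criterion used above concrete, the following base-R sketch builds a 4-nearest-neighbour structure by hand. It is for illustration only: the names d, knn_idx and W are not part of spqdep or spdep, and spdep::knearneigh() remains the supported way to build this structure.

```r
# Illustrative only: k-nearest-neighbour connectivity with base R
set.seed(123)
N <- 100
coor <- cbind(cx = runif(N), cy = runif(N))
k <- 4

d <- as.matrix(dist(coor))   # pairwise Euclidean distances
diag(d) <- Inf               # a point is never its own neighbour

# Row i holds the indices of the k points closest to point i
knn_idx <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Row-standardised weights: each neighbour receives weight 1/k
W <- matrix(0, N, N)
W[cbind(rep(seq_len(N), times = k), as.vector(knn_idx))] <- 1 / k

rowSums(W)[1:5]
```

Note that a nearest-neighbour relation is generally not symmetric: point j can be among the 4 nearest neighbours of point i without the reverse being true.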
The Q-test [@ruiz2010] is based on m-surroundings.
Before applying the Q-test, it is necessary to define a set of m-surroundings.
The \code{m.surround()} function generates a set of m-surroundings.
The user can tune several parameters to obtain a congruent set of m-surroundings.
The output of \code{m.surround()} is an object of class m_surr.
Using the \code{plot()} method, the user can explore the coherence of the m-surroundings.
For example, the following code obtains m-surroundings of length m = 3 with overlapping degree r = 1:
m = 3
r = 1
mh <- m.surround(x = cbind(cx,cy), m = m, r = r)
class(mh)
The spqdep package has three methods that can be applied to this class: \code{print()}, \code{summary()} and \code{plot()}.
print(mh)
summary(mh)
plot(mh, type = 1)
For example, with the control argument, the user can 'prune' non-coherent m-surroundings.
control <- list(dtmaxknn = 10)
mh.prune <- m.surround(x = coor, m = m, r = r, control = control)
plot(mh.prune)
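As a reminder of what is being assembled here, an m-surrounding of a location is that location together with its m - 1 nearest neighbours. The base-R sketch below builds one m-surrounding per point. It deliberately ignores the overlapping degree r and the pruning controls, so it is a simplification and not the algorithm used by m.surround(); the helper name one_surround is hypothetical.

```r
# Illustrative only: one m-surrounding per point, no overlap rule, no pruning
set.seed(123)
N <- 100
coor <- cbind(cx = runif(N), cy = runif(N))
m <- 3
d <- as.matrix(dist(coor))   # pairwise Euclidean distances

one_surround <- function(i, d, m) {
  # order() puts i first (distance 0), followed by its nearest neighbours
  unname(order(d[i, ])[seq_len(m)])
}

# Row i lists the point indices that form the m-surrounding of point i
ms <- t(sapply(seq_len(N), one_surround, d = d, m = m))
head(ms, 3)
```

With r = 1, consecutive m-surroundings are only allowed to share one location, which is why the package returns far fewer m-surroundings than there are points.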
The user must select the length of the m-surroundings (m) and the overlapping degree (r). In the following example, the Q-test is computed for the spatial process (fx) generated with \code{dgp.spq()}. The coordinates (coor) must be passed as an argument.
q.test <- Q.test(fx = fx, coor = coor, m = 3, r = 1)
The output is a list with the results for symbols based on permutations (standard) and combinations (equivalent).
The output of this function is an object of the spqtest class.
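The two symbolisations can be illustrated by enumeration. With k categories and m-surroundings of length m, ordered sequences give k^m standard symbols, while ignoring the order gives choose(k + m - 1, m) equivalent symbols. The sketch below uses hypothetical category labels and base R only; it does not call spqdep.

```r
# Illustrative only: enumerate the symbols for k = 3 categories and m = 3
k <- 3
m <- 3
cats <- c("A", "B", "C")   # hypothetical category labels

# Standard symbols: every ordered sequence of m categories
std <- expand.grid(rep(list(cats), m), stringsAsFactors = FALSE)
nrow(std)

# Equivalent symbols: sequences that differ only in order are merged
equiv <- unique(t(apply(std, 1, sort)))
nrow(equiv)

# Closed-form counts for comparison
c(standard = k^m, equivalent = choose(k + m - 1, m))
```

The much smaller number of equivalent symbols is what makes the combinations-based version usable when the sample is small relative to k^m.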
By default, the significance of the Q-test is obtained from its asymptotic distribution [@ruiz2010].
Alternatively, the significance of the test can be obtained with the Monte Carlo method. This approach is described in @lopez2012.
q.test.mc <- Q.test(fx = fx, coor = coor, m = 3, r = 1, distr = "mc")
summary(q.test.mc)
The summary method can be applied to an object of class spqtest:
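The general logic of a Monte Carlo significance level can be sketched in a few lines of base R. The statistic used here (the share of points whose nearest neighbour has the same category) is a hypothetical stand-in chosen for brevity; it is not the Q statistic, and the code does not reproduce how Q.test() is implemented.

```r
# Illustrative only: permutation-based pseudo p-value for a toy statistic
set.seed(123)
N <- 100
coor <- cbind(runif(N), runif(N))
fx <- factor(sample(c("A", "B", "C"), N, replace = TRUE))

d <- as.matrix(dist(coor))
diag(d) <- Inf
nn1 <- apply(d, 1, which.min)   # index of each point's nearest neighbour

# Hypothetical statistic: share of points matching their nearest neighbour
stat <- function(x) mean(x == x[nn1])

obs  <- stat(fx)
nsim <- 199
# Shuffling the labels breaks any spatial structure but keeps the frequencies
sim  <- replicate(nsim, stat(sample(fx)))

# Pseudo p-value: the observed value counts as one of the nsim + 1 draws
p_mc <- (1 + sum(sim >= obs)) / (nsim + 1)
p_mc
```

Because the observed value is included among the draws, the smallest attainable p-value is 1 / (nsim + 1).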
summary(q.test)
The histogram of the number of symbols is obtained by applying the plot method.
plot(q.test)
# Case 3: With an sf object with isolated areas
sf_use_s2(FALSE)
provinces_spain$Male2Female <- factor(provinces_spain$Male2Female > 100)
levels(provinces_spain$Male2Female) = c("men","woman")
f1 <- ~ Male2Female
q.test.sf <- Q.test(formula = f1, data = provinces_spain, m = 3, r = 1)
plot(q.test.sf)
The following code generates two qualitative spatial processes with different levels of spatial dependence and applies the Q-Map test.
p <- c(1/6,3/6,2/6)
rho = 0.5
QY1 <- dgp.spq(p = p, listw = listw, rho = rho)
rho = 0.8
QY2 <- dgp.spq(p = p, listw = listw, rho = rho)
dt = data.frame(QY1,QY2)
m = 3
r = 1
formula <- ~ QY1 + QY2
control <- list(dtmaxknn = 10)
qmap <- Q.map.test(formula = formula, data = dt, coor = coor, m = m, r = r,
                   type = "combinations", control = control)
print(qmap[[1]])
plot(qmap, ci=.6)
The runs test [@ruiz2021] has global and local versions.
listw <- knearneigh(coor, k = 3)
srq <- sp.runs.test(fx = fx, listw = listw)
print(srq)
plot(srq)
lsrq <- local.sp.runs.test(fx = fx, listw = listw, alternative = "less")
print(lsrq)
plot(lsrq, sig = 0.05)
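The building block of a runs test is the number of runs, that is, maximal stretches of identical categories in a sequence; here the sequence is assumed to be a location followed by its ordered neighbours. A minimal base-R sketch with rle() on a made-up sequence follows. The helper name count_runs is hypothetical, and the way the package orders neighbours and standardises the count is not reproduced.

```r
# Illustrative only: count runs in a sequence of categories
count_runs <- function(x) length(rle(as.character(x))$lengths)

# Hypothetical sequence: a location followed by five ordered neighbours
seq_i <- c("A", "A", "B", "B", "B", "A")
count_runs(seq_i)                     # runs are AA, BBB, A

# The two extremes for a sequence of length 4
count_runs(rep("A", 4))               # a single run: all values identical
count_runs(c("A", "B", "A", "B"))     # as many runs as elements
```

A low number of runs means that neighbouring observations tend to share the same category, which is presumably the situation targeted by alternative = "less" in the local test above.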
data("provinces_spain")
listw <- spdep::poly2nb(as(provinces_spain,"Spatial"), queen = FALSE)
provinces_spain$Male2Female <- factor(provinces_spain$Male2Female > 100)
levels(provinces_spain$Male2Female) = c("men","woman")
plot(provinces_spain["Male2Female"])
formula <- ~ Male2Female
# Bootstrap version
lsrq <- local.sp.runs.test(formula = formula, data = provinces_spain, listw = listw,
                           distr = "bootstrap", nsim = 199)
plot(lsrq, sf = provinces_spain, sig = 0.10)
Two scan tests for cluster detection can be applied to test for spatial structure in qualitative spatial processes.
The scan tests do not require the classical W connectivity matrix to be defined in advance.
See @Kanaroglou2016.
The scan tests evaluate the null hypothesis of independence of a spatial qualitative process and give additional information by indicating one (or perhaps more) spatial cluster(s).
The scan tests do not have an asymptotic distribution. Their significance is obtained by permutational resampling.
The output of the scan function is an object of classes scantest and htest.
formula <- ~ Male2Female
scan.spain <- spqdep::scan.test(formula = formula, data = provinces_spain,
                                case = "men", nsim = 99, distr = "bernoulli")
print(scan.spain)
listw <- spdep::poly2nb(provinces_spain, queen = FALSE)
scan.spain <- spqdep::scan.test(formula = formula, data = provinces_spain,
                                case = "men", nsim = 99, windows = "flexible",
                                listw = listw, nv = 6, distr = "bernoulli")
print(scan.spain)
data(FastFood.sf)
formula <- ~ Type
scan.fastfood <- scan.test(formula = formula, data = FastFood.sf, nsim = 99,
                           distr = "multinomial", windows = "elliptic", nv = 50)
print(scan.fastfood)
summary(scan.fastfood)
plot(scan.spain, sf = provinces_spain)
plot(scan.fastfood, sf = FastFood.sf)
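For the Bernoulli case, the quantity that a scan statistic maximises over candidate windows can be sketched as a log-likelihood ratio. The expression below is the standard Kulldorff form for a window with an excess of cases; that scan.test() uses exactly this expression is an assumption, not something verified against the package. The function names xlogy and bernoulli_llr and the counts in the example are hypothetical.

```r
# Illustrative only: Bernoulli log-likelihood ratio for one candidate window
# x * log(y) with the convention 0 * log(0) = 0
xlogy <- function(x, y) ifelse(x == 0, 0, x * log(y))

# c_in: cases inside the window, n_in: observations inside the window
# C: total cases, N: total observations (requires 0 < n_in < N)
bernoulli_llr <- function(c_in, n_in, C, N) {
  c_out <- C - c_in
  n_out <- N - n_in
  p_in  <- c_in / n_in
  p_out <- c_out / n_out
  p_all <- C / N
  if (p_in <= p_out) return(0)   # only windows with an excess of cases count
  xlogy(c_in, p_in) + xlogy(n_in - c_in, 1 - p_in) +
    xlogy(c_out, p_out) + xlogy(n_out - c_out, 1 - p_out) -
    xlogy(C, p_all) - xlogy(N - C, 1 - p_all)
}

# Hypothetical counts: 8 of 10 inside the window, 20 of 50 overall
bernoulli_llr(c_in = 8, n_in = 10, C = 20, N = 50)
```

The scan statistic is the maximum of this ratio over all windows considered, and its significance comes from recomputing that maximum on permuted data, which matches the resampling approach described above.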
Farber et al. (2015) developed the similarity test.
The \code{similarity.test()} function computes the similarity test using either the asymptotic distribution or permutational resampling.
coor <- st_coordinates(st_centroid(FastFood.sf))
listw <- spdep::knearneigh(coor, k = 4)
formula <- ~ Type
similarity <- similarity.test(formula = formula, data = FastFood.sf, listw = listw)
print(similarity)
provinces_spain$Older <- cut(provinces_spain$Older, breaks = c(-Inf,19,22.5,Inf))
levels(provinces_spain$Older) = c("low","middle","high")
f1 <- ~ Older + Male2Female
jc1 <- jc.test(formula = f1, data = provinces_spain, distr = "asymptotic",
               alternative = "greater", zero.policy = TRUE)
summary(jc1)
jc1 <- jc.test(formula = f1, data = provinces_spain, distr = "mc",
               alternative = "greater", zero.policy = TRUE)
summary(jc1)
Farber, S., Marin, M. R., & Páez, A. (2015). Testing for spatial independence using similarity relations. Geographical Analysis, 47(2), 97-120.
Paez, A., Lopez, F. A., Menezes, T., Cavalcanti, R., & Pitta, M. G. D. R. (2021). A spatio-temporal analysis of the environmental correlates of COVID-19 incidence in Spain. Geographical Analysis, 53(3), 397-421.