sp.runs.test | R Documentation |
This function computes the global spatial runs test for spatial independence of a categorical spatial data set.
sp.runs.test(formula = NULL, data = NULL, fx = NULL,
listw = listw, alternative = "two.sided",
distr = "asymptotic", nsim = NULL, control = list())
formula |
a symbolic description of the factor (optional). |
data |
an (optional) data frame or an sf object containing the variable to test. |
fx |
a factor (optional). |
listw |
A neighbourhood list (type knn or nb) or a W matrix that indicates the order of the elements in each |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". |
distr |
A string specifying how the distribution of the test statistic is obtained: "asymptotic" (default) or "bootstrap". |
nsim |
Number of permutations used to obtain the pseudo p-value and confidence intervals (CI). The default is NULL, in which case no CI for the number of runs is computed. |
control |
List of additional control arguments. |
The order of the neighbourhoods (m_i-environments) is critical to obtain the test.
To obtain the number of runs observed in each m_i-environment, each element must be associated with a set of neighbours ordered by proximity.
Three kinds of lists can be included to identify m_i-environments:
knn: Objects of the class knn, which list the neighbours in order of proximity.
nb: If the neighbours are obtained from an sf object, the code internally calls the function nb2nb_order to sort them by proximity of the centroids.
matrix: If an object of class matrix based on the inverse of the distance is supplied as the argument, the function nb2nb_order is also called internally to transform the matrix object into an object of class nb with ordered neighbours.
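The idea of ordering each neighbour set by proximity can be illustrated with a minimal base R sketch. The helper `order_neighbours_by_distance` is hypothetical and is not the package's nb2nb_order; it assumes a plain list of integer neighbour indices, a coordinate matrix, and at least one neighbour per location.

```r
# Hypothetical helper (not part of the package): sort each neighbour set
# by Euclidean distance to its focal location.
# Assumes every location has at least one neighbour.
order_neighbours_by_distance <- function(neighbours, coords) {
  lapply(seq_along(neighbours), function(i) {
    ids <- neighbours[[i]]
    # Coordinate differences between each neighbour and location i
    diffs <- t(t(coords[ids, , drop = FALSE]) - coords[i, ])
    d <- sqrt(rowSums(diffs^2))
    ids[order(d)]
  })
}

# Toy data: four points and unordered neighbour sets
coords <- cbind(x = c(0, 1, 3, 0), y = c(0, 0, 0, 2))
neighbours <- list(c(3L, 2L, 4L), c(3L, 1L), c(1L, 2L), c(2L, 1L))

ordered <- order_neighbours_by_distance(neighbours, coords)
print(ordered)
```

After this step the first index in each set is the nearest neighbour, which is the property the runs count relies on.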
Two alternative sets of arguments can be included in this function to compute the spatial runs test:
Option 1 | A factor (fx) and a neighbourhood list (listw) of class knn. |
Option 2 | An sf object (data) and a formula to specify the factor, together with a neighbourhood list (listw). |
An object of the htest and sprunstest classes
data.name | a character string giving the names of the data. |
method | the type of test applied (). |
SR | total number of runs |
dnr | empirical distribution of the number of runs |
statistic | Value of the homogeneity runs statistic. A negative sign indicates global homogeneity. |
alternative | a character string describing the alternative hypothesis. |
p.value | p-value of the SRQ test |
pseudo.value | the pseudo p-value of the SRQ test if nsim is not NULL |
MeanNeig | Mean of the maximum number of neighbours |
MaxNeig | Maximum number of neighbours |
listw | The neighbourhood list |
nsim | number of bootstrap replications (bootstrap version only) |
SRGP | nsim simulated values of the statistic. |
SRLP | matrix with the number of runs for each location. |
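How a set of nsim simulated statistics can yield a pseudo p-value is sketched below in base R. This is an illustration under stated assumptions, not the package's implementation: the helper `total_runs`, the one-dimensional toy layout, and the (count + 1) / (nsim + 1) convention are all assumptions made for the example.

```r
# Hypothetical helper: total number of runs, counting for each location the
# runs in the sequence (focal value, ordered neighbour values).
total_runs <- function(x, neighbours) {
  sum(vapply(seq_along(x), function(i) {
    v <- x[c(i, neighbours[[i]])]
    1L + sum(v[-1] != v[-length(v)])
  }, integer(1)))
}

set.seed(123)
n <- 60
# Strongly clustered categories along a line (toy data)
x <- rep(c("a", "b"), each = n / 2)
# Four nearest neighbours of each position on the line, ordered by distance
neighbours <- lapply(seq_len(n), function(i) {
  ids <- setdiff(seq_len(n), i)
  ids[order(abs(ids - i))][1:4]
})

SR_obs <- total_runs(x, neighbours)

# Permute the categories over the locations and recompute the statistic
nsim <- 199
SR_sim <- replicate(nsim, total_runs(sample(x), neighbours))

# Lower-tailed pseudo p-value (few runs = homogeneity); assumed convention
pseudo_p_lower <- (sum(SR_sim <= SR_obs) + 1) / (nsim + 1)
print(c(SR_obs = SR_obs, pseudo_p_lower = pseudo_p_lower))
```

Because the toy data are clustered, the observed number of runs falls far below the permuted values, so the lower-tailed pseudo p-value reaches its minimum attainable value for this nsim.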
This section defines the concepts of spatial encoding and runs, and constructs the main statistics necessary
for testing spatial homogeneity of categorical variables. In order to develop a general theoretical setting,
let \{X_s\}_{s \in S} be the categorical spatial process of interest with Q different
categories, where S is a set of coordinates.
Spatial encoding:
For a location s \in S, denote by N_s = \{s_1, s_2, ..., s_{n_s}\} the set of neighbours according
to the interaction scheme W, ordered from smallest to largest Euclidean distance with respect to location s.
A sequence X_{s_i}, X_{s_{i+1}}, ..., X_{s_{i+r}} whose elements all have the same value (or are identified by the same class)
is called a spatial run at location s of length r.
The total number of runs is defined as:
SR^Q = n + \sum_{s \in S} \sum_{j=1}^{n_s} I_j^s
where I_j^s = 1 if X_{s_{j-1}} \neq X_{s_j} and I_j^s = 0 otherwise, for j = 1, 2, ..., n_s.
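The counting rule can be sketched in base R on a toy data set. The helper `count_runs_at` is hypothetical, and the sketch assumes that s_0 denotes the focal location s itself, so that each sequence starts with X_s followed by its ordered neighbours.

```r
# Hypothetical helper: runs at location i, i.e. 1 plus the number of
# category changes along (X_s, X_{s_1}, ..., X_{s_{n_s}}).
# Assumption: s_0 is the focal location s.
count_runs_at <- function(i, x, neighbours) {
  v <- as.character(x[c(i, neighbours[[i]])])
  1L + sum(v[-1] != v[-length(v)])
}

# Toy data: five locations, two categories, neighbours already ordered
x <- factor(c("a", "a", "b", "b", "a"))
neighbours <- list(c(2L, 3L), c(1L, 3L), c(2L, 4L), c(3L, 5L), c(4L, 3L))

local_runs <- vapply(seq_along(x),
                     function(i) count_runs_at(i, x, neighbours),
                     integer(1))

# Total number of runs: n plus the sum of all indicators I_j^s
SR <- sum(local_runs)
print(local_runs)
print(SR)
```

Each local count equals one plus the number of indicators equal to 1 at that location, so summing the local counts gives n plus the double sum in the definition.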
By the Central Limit Theorem, the asymptotic distribution of SR^Q
is:
SR^Q \sim N(\mu_{SR^Q}, \sigma_{SR^Q})
In the one-tailed case, we must distinguish the lower-tailed test from the upper-tailed test, which are associated
with homogeneity and heterogeneity respectively. In the case of the lower-tailed test,
the following hypotheses are used:
H_0: \{X_s\}_{s \in S} is i.i.d.
H_1: The spatial distribution of the values of the categorical variable is more homogeneous than under the null hypothesis (according to the fixed association scheme).
In the upper-tailed test, the following hypotheses are used:
H_0: \{X_s\}_{s \in S} is i.i.d.
H_1: The spatial distribution of the values of the categorical variable is more heterogeneous than under the null hypothesis (according to the fixed association scheme).
These hypotheses provide a decision rule regarding the degree of homogeneity in the spatial distribution
of the values of the spatial categorical random variable.
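The decision rule for the lower-tailed, upper-tailed and two-sided cases can be sketched with the normal approximation. The numbers below are purely illustrative, and the sketch assumes that \sigma_{SR^Q} is a standard deviation; how the function computes the null mean and dispersion is not shown here.

```r
# Illustrative inputs only (hypothetical values, not output of the package)
SR_obs   <- 230   # observed total number of runs
mu_SR    <- 250   # mean of SR^Q under the null
sigma_SR <- 8     # standard deviation of SR^Q under the null (assumed)

# Standardised statistic: negative values point towards homogeneity
z <- (SR_obs - mu_SR) / sigma_SR

p_lower     <- pnorm(z)                      # H1: more homogeneous (fewer runs)
p_upper     <- pnorm(z, lower.tail = FALSE)  # H1: more heterogeneous (more runs)
p_two_sided <- 2 * pnorm(-abs(z))            # H1: any departure from i.i.d.

print(c(z = z, lower = p_lower, upper = p_upper, two.sided = p_two_sided))
```

With fewer runs than expected, the lower-tailed p-value is small and the upper-tailed one is close to 1, in line with the association of the lower tail with homogeneity.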
seedinit | Numerical value for the seed (bootstrap version only). Default value is seedinit = 123. |
Fernando López | fernando.lopez@upct.es |
Román Mínguez | roman.minguez@uclm.es |
Antonio Páez | paezha@gmail.com |
Manuel Ruiz | manuel.ruiz@upct.es |
Ruiz, M., López, F., and Páez, A. (2021). A test for global and local homogeneity of categorical data based on spatial runs. Working paper.
local.sp.runs.test, dgp.spq, Q.test
# Case 1: SRQ test based on factor and knn
n <- 100
cx <- runif(n)
cy <- runif(n)
x <- cbind(cx,cy)
listw <- spdep::knearneigh(x, k=3)
p <- c(1/6,3/6,2/6)
rho <- 0.5
fx <- dgp.spq(listw = listw, p = p, rho = rho)
srq <- sp.runs.test(fx = fx, listw = listw)
print(srq)
plot(srq)
# Bootstrap version
control <- list(seedinit = 1255)
srq <- sp.runs.test(fx = fx, listw = listw, distr = "bootstrap" , nsim = 299, control = control)
print(srq)
plot(srq)
# Case 2: SRQ test with formula, a sf object (points) and knn
data("FastFood.sf")
x <- sf::st_coordinates(sf::st_centroid(FastFood.sf))
listw <- spdep::knearneigh(x, k=4)
formula <- ~ Type
srq <- sp.runs.test(formula = formula, data = FastFood.sf, listw = listw)
print(srq)
plot(srq)
# Bootstrap version
srq <- sp.runs.test(formula = formula, data = FastFood.sf, listw = listw,
distr = "bootstrap", nsim = 199)
print(srq)
plot(srq)
# Case 3: SRQ test (permutation) using formula with a sf object (polygons) and nb
library(sf)
fname <- system.file("shape/nc.shp", package="sf")
nc <- sf::st_read(fname)
listw <- spdep::poly2nb(as(nc,"Spatial"), queen = FALSE)
p <- c(1/6,3/6,2/6)
rho <- 0.5
co <- sf::st_coordinates(sf::st_centroid(nc))
nc$fx <- dgp.spq(listw = listw, p = p, rho = rho)
plot(nc["fx"])
formula <- ~ fx
srq <- sp.runs.test(formula = formula, data = nc, listw = listw,
distr = "bootstrap", nsim = 399)
print(srq)
plot(srq)
# Case 4: SRQ test (Asymptotic) using formula with a sf object (polygons) and nb
data(provinces_spain)
# sf::sf_use_s2(FALSE)
listw <- spdep::poly2nb(provinces_spain, queen = FALSE)
provinces_spain$Coast <- factor(provinces_spain$Coast)
levels(provinces_spain$Coast) <- c("no","yes")
plot(provinces_spain["Coast"])
formula <- ~ Coast
srq <- sp.runs.test(formula = formula, data = provinces_spain, listw = listw)
print(srq)
plot(srq)
# Bootstrap version
srq <- sp.runs.test(formula = formula, data = provinces_spain, listw = listw,
distr = "bootstrap", nsim = 299)
print(srq)
plot(srq)
# Case 5: SRQ test based on a distance matrix (inverse distance)
N <- 100
cx <- runif(N)
cy <- runif(N)
data <- as.data.frame(cbind(cx,cy))
data <- sf::st_as_sf(data,coords = c("cx","cy"))
n <- nrow(data)
dis <- 1/matrix(as.numeric(sf::st_distance(data,data)),ncol=n,nrow=n)
diag(dis) <- 0
dis <- (dis < quantile(dis,.10))*dis
p <- c(1/6,3/6,2/6)
rho <- 0.5
fx <- dgp.spq(listw = dis , p = p, rho = rho)
srq <- sp.runs.test(fx = fx, listw = dis)
print(srq)
plot(srq)
srq <- sp.runs.test(fx = fx, listw = dis, data = data)
print(srq)
plot(srq)
# Bootstrap version
srq <- sp.runs.test(fx = fx, listw = dis, data = data, distr = "bootstrap", nsim = 299)
print(srq)
plot(srq)
# Case 6: SRQ test using formula with a sf object (points) and a distance matrix (inverse distance)
data("FastFood.sf")
# sf::sf_use_s2(FALSE)
n <- nrow(FastFood.sf)
dis <- 1000000/matrix(as.numeric(sf::st_distance(FastFood.sf,FastFood.sf)), ncol = n, nrow = n)
diag(dis) <- 0
dis <- (dis < quantile(dis,.005))*dis
p <- c(1/6,3/6,2/6)
rho <- 0.5
co <- sf::st_coordinates(sf::st_centroid(FastFood.sf))
FastFood.sf$fx <- dgp.spq(p = p, listw = dis, rho = rho)
plot(FastFood.sf["fx"])
formula <- ~ fx
# Bootstrap version
srq <- sp.runs.test(formula = formula, data = FastFood.sf, listw = dis,
distr = "bootstrap", nsim = 299)
print(srq)
plot(srq)