mlink.test: Maximum Linkage spatial scan test

mlink.test    R Documentation

Maximum Linkage spatial scan test

Description

mlink.test implements the Maximum Linkage spatial scan test of Costa et al. (2012). Starting with a single region as the current zone, new candidate zones are constructed by combining the current zone with the connected region that maximizes the resulting likelihood ratio test statistic, with the added constraint that the region must have the maximum number of connections to (i.e., shared borders with) the regions in the current zone. This procedure is repeated until the population or distance upper bound is reached. The same procedure is repeated for each starting region. The clusters returned are non-overlapping and ordered from most significant to least significant, so the first cluster is the most likely cluster. If no significant clusters are found, then the most likely cluster is returned (along with a warning).
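
The linkage constraint described above can be illustrated with a small base R sketch. It is not the smerc implementation: the 5-region adjacency matrix, the edges helper, and the max_linked name are hypothetical, and the sketch only counts shared borders; it does not compute the likelihood ratio statistic.

```r
# Toy adjacency matrix for 5 regions (hypothetical data, not from smerc).
w <- matrix(0, nrow = 5, ncol = 5)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(2, 4), c(3, 4), c(4, 5))
w[edges] <- 1          # fill the (row, col) pairs
w[edges[, 2:1]] <- 1   # mirror them so w is symmetric

zone <- c(1, 2)        # current zone

# Number of borders each region shares with the current zone.
links <- colSums(w[zone, , drop = FALSE])
links[zone] <- 0       # regions already in the zone are not candidates

# Candidate regions with the maximum number of links to the zone.
max_linked <- which(links > 0 & links == max(links))
max_linked
```

Here region 3 borders both regions in the zone, while region 4 borders only region 2, so region 3 is the only maximally linked candidate.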

Usage

mlink.test(
  coords,
  cases,
  pop,
  w,
  ex = sum(cases)/sum(pop) * pop,
  nsim = 499,
  alpha = 0.1,
  ubpop = 0.5,
  ubd = 1,
  longlat = FALSE,
  cl = NULL
)

Arguments

coords

An n × 2 matrix of centroid coordinates for the regions in the form (x, y), or (longitude, latitude) if using great circle distance.

cases

The number of cases observed in each region.

pop

The population size associated with each region.

w

A binary spatial adjacency matrix for the regions.

ex

The expected number of cases for each region. The default is calculated under the constant risk hypothesis.

nsim

The number of simulations from which to compute the p-value.

alpha

The significance level used to determine whether a cluster is significant. Default is 0.10.

ubpop

The upper bound on the proportion of the total population to consider for a cluster.

ubd

A proportion in (0, 1]. The distance of potential clusters must be no more than ubd * m, where m is the maximum intercentroid distance between all coordinates.

longlat

The default is FALSE, which specifies that Euclidean distance should be used. If longlat is TRUE, then the great circle distance is used to calculate the intercentroid distance.

cl

A cluster object created by makeCluster, or an integer indicating the number of child processes (integer values are ignored on Windows) for parallel evaluation (see Details on performance). It can also be "future" to use a future backend (see Details). NULL (the default) means sequential evaluation.
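
As a rough illustration of the ex default and the ubpop bound, the base R sketch below uses made-up counts for four regions. The max_pop and zone_ok names are hypothetical, and treating the bound as inclusive (<=) is an assumption rather than a statement about how smerc applies it.

```r
# Hypothetical counts for four regions.
cases <- c(2, 0, 5, 3)
pop <- c(100, 200, 300, 400)

# Default expected counts under the constant risk hypothesis:
# the overall rate times each region's population.
ex <- sum(cases) / sum(pop) * pop

# Population cap implied by ubpop.
ubpop <- 0.5
max_pop <- ubpop * sum(pop)

# A candidate zone made of regions 3 and 4 holds 700 of 1000 people,
# so it exceeds the cap (assumed here to be an inclusive bound).
zone <- c(3, 4)
zone_ok <- sum(pop[zone]) <= max_pop
```

The expected counts sum to the total number of observed cases, which is what the constant risk hypothesis implies.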

Details

The maximum intercentroid distance m can be found by taking the maximum of the intercentroid distance matrix computed from the specified values of coords and longlat: max(gedist(as.matrix(coords), longlat = longlat)).
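
For the Euclidean case (longlat = FALSE), the distance bound can be sketched with base R alone. The example uses stats::dist as a stand-in for gedist, on the assumption that both yield the same Euclidean intercentroid distances; the coordinates and the max_dist name are hypothetical.

```r
# Hypothetical centroids for three regions.
coords <- cbind(x = c(0, 3, 6), y = c(0, 4, 8))

# Pairwise Euclidean distances between centroids.
d <- as.matrix(dist(coords))

# Maximum intercentroid distance.
m <- max(d)

# Largest distance allowed for a potential cluster when ubd = 0.5.
ubd <- 0.5
max_dist <- ubd * m
```

With these coordinates the two most distant centroids are 10 units apart, so ubd = 0.5 limits potential clusters to a distance of 5.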

Value

Returns a smerc_cluster object.

Author(s)

Joshua French

References

Costa, M.A., Assuncao, R.M., and Kulldorff, M. (2012). Constrained spanning tree algorithms for irregularly-shaped spatial clustering. Computational Statistics & Data Analysis, 56(6), 1771-1783. <doi:10.1016/j.csda.2011.11.001>

See Also

print.smerc_cluster, summary.smerc_cluster, plot.smerc_cluster, scan.stat, scan.test

Examples

data(nydf)
data(nyw)
coords <- with(nydf, cbind(longitude, latitude))
out <- mlink.test(
  coords = coords, cases = floor(nydf$cases),
  pop = nydf$pop, w = nyw,
  alpha = 0.12, longlat = TRUE,
  nsim = 2, ubpop = 0.05, ubd = 0.1
)
# better plotting
if (require("sf", quietly = TRUE)) {
   data(nysf)
   plot(st_geometry(nysf), col = color.clusters(out))
}

jfrench/smerc documentation built on Oct. 27, 2024, 5:13 p.m.