edmst.zones: Determine zones for the early stopping dynamic Minimum...

edmst.zones    R Documentation

Determine zones for the early stopping dynamic Minimum Spanning Tree scan test

Description

edmst.zones determines the zones for the early stopping Dynamic Minimum Spanning Tree scan test (edmst.test). The function returns the zones, as well as the associated test statistic, cases in each zone, the expected number of cases in each zone, and the population in each zone.

Usage

edmst.zones(
  coords,
  cases,
  pop,
  w,
  ex = sum(cases)/sum(pop) * pop,
  ubpop = 0.5,
  ubd = 1,
  longlat = FALSE,
  cl = NULL,
  progress = TRUE
)

Arguments

coords

An n × 2 matrix of centroid coordinates for the regions, in the form (x, y), or (longitude, latitude) if using great circle distance.

cases

The number of cases observed in each region.

pop

The population size associated with each region.

w

A binary spatial adjacency matrix for the regions.

ex

The expected number of cases for each region. The default is calculated under the constant risk hypothesis.

ubpop

The upper bound on the proportion of the total population to consider for a cluster.

ubd

A proportion in (0, 1]. The maximum intercentroid distance within a potential cluster must be no more than ubd * m, where m is the maximum intercentroid distance between all coordinates.

longlat

The default is FALSE, which specifies that Euclidean distance should be used. If longlat is TRUE, then the great circle distance is used to calculate the intercentroid distance.

cl

A cluster object created by makeCluster, or an integer indicating the number of child processes to use for parallel evaluation (integer values are ignored on Windows). It can also be "future" to use a future backend. NULL (the default) requests sequential evaluation.

progress

A logical value indicating whether a progress bar should be displayed. The default is TRUE.
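
The default for ex applies the overall case rate to each region's population. The sketch below reproduces that calculation in base R; the cases and pop vectors are made-up values for illustration, not taken from any package data set.

```r
# Hypothetical counts for four regions (illustrative values only)
cases <- c(2, 0, 5, 3)
pop   <- c(100, 50, 200, 150)

# Constant risk hypothesis: every region shares the overall rate
overall_rate <- sum(cases) / sum(pop)
ex <- overall_rate * pop

# The expected counts sum to the total number of observed cases
ex
sum(ex)
```

Supplying a different ex (for example, age-adjusted expected counts) replaces this default.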

Details

Every zone considered must have a total population less than ubpop * sum(pop). Additionally, the maximum intercentroid distance for the regions within a zone must be no more than ubd * the maximum intercentroid distance across all regions.
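
The two constraints can be checked by hand for a candidate zone. The following is a minimal sketch using base R and Euclidean distance (the longlat = FALSE case); the coordinates, populations, and the helper zone_ok are hypothetical and are not part of the package.

```r
# Hypothetical centroids and populations for five regions
coords <- cbind(x = c(0, 1, 2, 5, 9), y = c(0, 1, 0, 4, 9))
pop    <- c(100, 80, 120, 300, 400)
ubpop  <- 0.5
ubd    <- 0.5

# Euclidean intercentroid distances and their maximum, m
d <- as.matrix(dist(coords))
m <- max(d)

# zone_ok is an illustrative helper: TRUE if the zone (a vector of
# region indices) satisfies both the population and distance bounds
zone_ok <- function(zone) {
  pop_ok  <- sum(pop[zone]) < ubpop * sum(pop)
  dist_ok <- max(d[zone, zone, drop = FALSE]) <= ubd * m
  pop_ok && dist_ok
}

zone_ok(c(1, 2, 3))  # small, compact zone
zone_ok(c(4, 5))     # holds 70% of the population
```

Lowering ubpop or ubd shrinks the set of admissible zones, which typically reduces run time as well.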

Value

Returns a list with elements:

zones

A list containing the location ids of each potential cluster.

loglikrat

The loglikelihood ratio for each zone (i.e., the log of the test statistic).

cases

The observed number of cases in each zone.

expected

The expected number of cases in each zone.

pop

The total population in each zone.
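
The returned elements are parallel: position i of loglikrat, cases, expected, and pop describes zones[[i]]. The sketch below builds a mock list with the same element names and made-up values, then picks the zone with the largest statistic. The function poisson_llr is a hypothetical helper that assumes the usual Poisson scan statistic form and that the expected counts sum to the total number of cases; it is shown only to illustrate how loglikrat relates to cases and expected, not as the package's implementation.

```r
# Mock result using the documented element names (made-up values)
res <- list(
  zones    = list(c(1, 2), c(3, 4, 5), c(6)),
  cases    = c(12, 30, 4),
  expected = c(8.5, 18.2, 3.9),
  pop      = c(1500, 3200, 700)
)
total_cases <- 100  # hypothetical total over the whole study area

# Assumed Poisson log-likelihood ratio: zero unless the rate inside
# the zone exceeds the rate outside it
poisson_llr <- function(yin, ein, ty) {
  yout <- ty - yin
  eout <- ty - ein
  llr <- yin * log(yin / ein) + yout * log(yout / eout)
  ifelse(yin / ein > yout / eout, llr, 0)
}
res$loglikrat <- poisson_llr(res$cases, res$expected, total_cases)

# Index and members of the zone with the largest statistic
best <- which.max(res$loglikrat)
res$zones[[best]]
```

With real output from edmst.zones, the same which.max step applies directly to the returned loglikrat element.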

Author(s)

Joshua French

References

Costa, M.A. and Assuncao, R.M. and Kulldorff, M. (2012) Constrained spanning tree algorithms for irregularly-shaped spatial clustering, Computational Statistics & Data Analysis, 56(6), 1771-1783. <doi:10.1016/j.csda.2011.11.001>

Examples

library(smerc)
data(nydf)
data(nyw)
coords <- as.matrix(nydf[, c("longitude", "latitude")])
# find zone with max statistic starting from each individual region
all_zones <- edmst.zones(coords,
  cases = floor(nydf$cases),
  pop = nydf$pop, w = nyw, ubpop = 0.25,
  ubd = 0.25, longlat = TRUE
)

jpfrench81/smerc documentation built on Jan. 13, 2024, 4:30 a.m.