mlf.zones: Determine zones for the maxima likelihood first algorithm.
In jfrench/smerc: Statistical Methods for Regional Counts

mlf.zones

R Documentation

Description

mlf.zones determines the most likely cluster zone obtained by implementing the maxima likelihood first scan method of Yao et al. (2011). Note that this is really just a special case of the dynamic minimum spanning tree (DMST) algorithm of Assuncao et al. (2006).

Usage

mlf.zones(
  coords,
  cases,
  pop,
  w,
  ex = sum(cases)/sum(pop) * pop,
  ubpop = 0.5,
  ubd = 1,
  longlat = FALSE
)

Arguments

`coords`	An `n \times 2` matrix of centroid coordinates for the regions in the form (x, y) or (longitude, latitude) if using great circle distance.
`cases`	The number of cases observed in each region.
`pop`	The population size associated with each region.
`w`	A binary spatial adjacency matrix for the regions.
`ex`	The expected number of cases for each region. The default is calculated under the constant risk hypothesis.
`ubpop`	The upper bound of the proportion of the total population to consider for a cluster.
`ubd`	A proportion in (0, 1]. The maximum intercentroid distance within a potential cluster must be no more than `ubd * m`, where `m` is the maximum intercentroid distance between all coordinates.
`longlat`	The default is `FALSE`, which specifies that Euclidean distance should be used. If `longlat` is `TRUE`, then the great circle distance is used to calculate the intercentroid distance.
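
As a minimal sketch of the default `ex` under the constant risk hypothesis, the following base R snippet applies the same formula shown in Usage to toy data. The values of `cases` and `pop` are made up for illustration and the variable `risk` is a hypothetical name.

```r
# Toy data (hypothetical values, not from the package)
cases <- c(2, 0, 5, 3)
pop <- c(100, 50, 200, 150)

# Overall risk: total cases divided by total population
risk <- sum(cases) / sum(pop)

# Expected cases per region, mirroring the default `ex` in Usage
ex <- risk * pop
print(ex)
```

By construction, the expected counts sum to the total number of observed cases.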

Details

Each step of the mlf scan test seeks to maximize the likelihood ratio test statistic used in the original spatial scan test (Kulldorff 1997). The first zone considered is the region that maximizes this likelihood ratio test statistic, provided that no more than ubpop proportion of the total population is in the zone. The second zone is the first zone and the connected region that maximizes the scan statistic, subject to the population and distance constraints. This pattern continues until no additional zones can be added due to population or distance constraints.
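
For readers unfamiliar with the statistic, here is a minimal sketch of the Poisson log-likelihood ratio from the spatial scan test, written in base R. The function name `pois_llr` and its arguments are hypothetical; this is an illustration of the standard formula, not the package's internal code, and it assumes the expected counts sum to the total number of cases.

```r
# Hypothetical helper: Poisson log-likelihood ratio for a candidate zone.
# yin: observed cases inside the zone
# ein: expected cases inside the zone
# ty:  total cases in the study area (assumed equal to total expected cases)
pois_llr <- function(yin, ein, ty) {
  yout <- ty - yin
  eout <- ty - ein
  # y * log(y / e), using the convention 0 * log(0) = 0
  term <- function(y, e) ifelse(y > 0, y * log(y / e), 0)
  llr <- term(yin, ein) + term(yout, eout)
  # Only zones with higher relative risk inside than outside score above 0
  ifelse(yin / ein > yout / eout, llr, 0)
}

# A zone with 5 observed and 4 expected cases out of 10 total cases
pois_llr(5, 4, 10)
# A zone with fewer cases than expected scores 0
pois_llr(2, 4, 10)
```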

Every zone considered must have a total population less than ubpop * sum(pop), where sum(pop) is the total population of the study area. Additionally, the maximum intercentroid distance for the regions within a zone must be no more than ubd * the maximum intercentroid distance across all regions.
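
The greedy growth described in this section can be sketched in base R as follows. This is an illustrative approximation under stated assumptions, not the smerc implementation: the names `llr_zone` and `mlf_sketch` are hypothetical, only Euclidean distance is handled, and both constraints are treated as "no more than" (`<=`).

```r
# Hypothetical helper: Poisson log-likelihood ratio for a zone
llr_zone <- function(yin, ein, ty) {
  yout <- ty - yin
  eout <- ty - ein
  if (ein <= 0 || eout <= 0) return(0)
  if (yin / ein <= yout / eout) return(0)
  part <- function(y, e) if (y > 0) y * log(y / e) else 0
  part(yin, ein) + part(yout, eout)
}

# Hypothetical greedy search; a sketch, not the smerc code
mlf_sketch <- function(coords, cases, pop, w,
                       ex = sum(cases) / sum(pop) * pop,
                       ubpop = 0.5, ubd = 1) {
  d <- as.matrix(dist(coords))   # Euclidean intercentroid distances
  max_d <- ubd * max(d)          # distance constraint
  max_pop <- ubpop * sum(pop)    # population constraint
  ty <- sum(cases)
  # Score a set of region indices
  score <- function(idx) llr_zone(sum(cases[idx]), sum(ex[idx]), ty)
  # Check both constraints for a set of region indices
  ok <- function(idx) sum(pop[idx]) <= max_pop && max(d[idx, idx]) <= max_d

  # Step 1: the admissible single region with the largest statistic
  singles <- Filter(ok, seq_along(cases))
  if (length(singles) == 0) {
    return(list(zones = list(), loglikrat = numeric(0)))
  }
  zone <- singles[which.max(vapply(singles, score, numeric(1)))]
  zones <- list(zone)

  repeat {
    # Regions adjacent to the current zone but not yet part of it
    nb <- setdiff(which(colSums(w[zone, , drop = FALSE]) > 0), zone)
    cand <- Filter(function(j) ok(c(zone, j)), nb)
    if (length(cand) == 0) break
    # Add the neighbor that maximizes the statistic of the enlarged zone
    gains <- vapply(cand, function(j) score(c(zone, j)), numeric(1))
    zone <- c(zone, cand[which.max(gains)])
    zones[[length(zones) + 1]] <- zone
  }
  list(zones = zones, loglikrat = vapply(zones, score, numeric(1)))
}

# Toy example: five equally populated regions along a line (made-up data)
coords <- cbind(0:4, 0)
w <- matrix(0, 5, 5)
w[cbind(1:4, 2:5)] <- 1
w <- w + t(w)
res <- mlf_sketch(coords, cases = c(1, 2, 9, 6, 2), pop = rep(100, 5), w = w)
res$zones
```

In this toy example the search starts at region 3, adds its neighbor region 4, and then stops because adding any further region would exceed the population bound.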

Value

Returns a list with elements:

`zones`	A list containing the location ids of each potential cluster.
`loglikrat`	The loglikelihood ratio for each zone (i.e., the log of the test statistic).
`cases`	The observed number of cases in each zone.
`expected`	The expected number of cases in each zone.
`pop`	The total population in each zone.
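
As a hedged illustration of working with a result of this shape, the snippet below builds a mock list with the documented element names and extracts the zone with the largest log-likelihood ratio. The object `out` and all of its values are made up; it is not actual output of mlf.zones.

```r
# Mock object using the documented element names (values are made up)
out <- list(
  zones = list(3, c(3, 4)),
  loglikrat = c(3.2, 5.1),
  cases = c(9, 15),
  expected = c(4, 8),
  pop = c(100, 200)
)

# Index of the zone with the largest log-likelihood ratio
best <- which.max(out$loglikrat)

# Location ids of that zone and its observed/expected ratio
mlc <- out$zones[[best]]
ratio <- out$cases[best] / out$expected[best]
```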

Author(s)

Joshua French

References

Yao, Z., Tang, J., & Zhan, F. B. (2011). Detection of arbitrarily-shaped clusters using a neighbor-expanding approach: A case study on murine typhus in South Texas. International Journal of Health Geographics, 10(1), 1.

Examples

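# Load the package that provides mlf.zones and the example data
library(smerc)
data(nydf)
data(nyw)
# Centroid coordinates taken from the x and y columns of nydf
coords <- as.matrix(nydf[, c("x", "y")])
# The (x, y) columns are not longitude/latitude, so use Euclidean distance
mlf.zones(coords,
  cases = floor(nydf$cases),
  pop = nydf$pop, w = nyw, longlat = FALSE
)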

jfrench/smerc documentation built on Oct. 27, 2024, 5:13 p.m.