optimal_ubpop: Optimal Population Upper Bound Statistics

View source: R/optimal_ubpop.R

optimal_ubpop    R Documentation

Description

optimal_ubpop computes statistics for choosing an optimal population upper bound. ubpop_seq is the sequence of candidate upper bounds to consider. Its smallest value must be at least min(pop)/sum(pop), and its values should generally be less than or equal to 0.5.
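The lower limit on ubpop_seq can be checked before calling the function. The following is a minimal sketch in base R; the population vector is made up for illustration, and valid_seq is a hypothetical helper name, not part of the package.

```r
# Hypothetical region populations (illustrative values only).
pop <- c(1200, 850, 4300, 560, 2900)

# Smallest admissible bound: the population share of the least populous region.
lower_limit <- min(pop) / sum(pop)

# Candidate bounds; this mirrors the default grid shown under Usage.
ubpop_seq <- seq(0.01, 0.5, length.out = 50)

# Keep only the candidates that satisfy the lower limit.
valid_seq <- ubpop_seq[ubpop_seq >= lower_limit]
```

With these made-up populations the lower limit is about 0.057, so the candidates 0.01 through 0.05 are dropped.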

Usage

optimal_ubpop(
  coords,
  cases,
  pop,
  ex = sum(cases)/sum(pop) * pop,
  nsim = 499,
  alpha = 0.05,
  ubpop_seq = seq(0.01, 0.5, len = 50),
  longlat = FALSE,
  cl = NULL,
  type = "poisson",
  min.cases = 0,
  simdist = "multinomial"
)

Arguments

coords

An n × 2 matrix of centroid coordinates for the regions, in the form (x, y), or (longitude, latitude) if using great circle distance.

cases

The number of cases observed in each region.

pop

The population size associated with each region.

ex

The expected number of cases for each region. The default is calculated under the constant risk hypothesis.
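As a small worked example of the default shown under Usage, the constant risk hypothesis assigns every region the same overall risk, so expected counts are proportional to population. The case and population values below are made up for illustration.

```r
# Hypothetical observed cases and populations for five regions.
cases <- c(3, 1, 10, 0, 6)
pop <- c(1200, 850, 4300, 560, 2900)

# Overall risk: total cases divided by total population.
overall_risk <- sum(cases) / sum(pop)

# Expected cases per region under constant risk (the default for ex).
ex <- overall_risk * pop
```

By construction, the expected counts add up to the observed total number of cases.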

nsim

The number of simulations from which to compute the p-value.

alpha

The significance level used to determine whether a cluster is significant. The default is 0.05.

ubpop_seq

A strictly increasing numeric vector with values between min(pop)/sum(pop) and 1. The default is seq(0.01, 0.5, len = 50).

longlat

The default is FALSE, which specifies that Euclidean distance should be used. If longlat is TRUE, then the great circle distance is used to calculate the intercentroid distance.
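To illustrate why the choice matters, the sketch below compares the two kinds of distance for a pair of points. The haversine formula and the 6371 km Earth radius are assumptions made for illustration; the package may compute great circle distance differently, and great_circle_km and euclidean are hypothetical helper names.

```r
# Great circle distance via the haversine formula (illustrative assumption;
# not necessarily the computation used by the package).
great_circle_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Plain Euclidean distance in whatever units the coordinates use.
euclidean <- function(x1, y1, x2, y2) sqrt((x2 - x1)^2 + (y2 - y1)^2)

# Two points on the equator, 90 degrees of longitude apart.
d_gc <- great_circle_km(0, 0, 90, 0) # a quarter of the Earth's circumference, in km
d_eu <- euclidean(0, 0, 90, 0)       # 90, in degrees, not a physical distance
```

Euclidean distance on raw longitude and latitude is measured in degrees, which is why great circle distance is usually the better fit for unprojected coordinates.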

cl

A cluster object created by makeCluster, or an integer indicating the number of child processes to use for parallel evaluation (integer values are ignored on Windows; see Details on performance). It can also be "future" to use a future backend (see Details). NULL (the default) specifies sequential evaluation.

type

The type of scan statistic to compute. The default is "poisson". The other choice is "binomial".

min.cases

The minimum number of cases required for a cluster. The default is 0.

simdist

Character string indicating the simulation distribution. The default is "multinomial", which conditions on the total number of cases observed. The other options are "poisson" and "binomial".
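The phrase "conditions on the total number of cases" can be illustrated with base R random number generators. This is only a sketch of the idea using made-up data, not the package's internal simulation code.

```r
set.seed(1)

# Hypothetical observed cases and populations for five regions.
cases <- c(3, 1, 10, 0, 6)
pop <- c(1200, 850, 4300, 560, 2900)
ex <- sum(cases) / sum(pop) * pop

# Multinomial simulation: the observed total is distributed across regions
# in proportion to the expected counts, so the total is always preserved.
sim_multinom <- drop(rmultinom(1, size = sum(cases), prob = ex / sum(ex)))

# Poisson simulation: each region is drawn independently, so the simulated
# total generally differs from the observed total.
sim_pois <- rpois(length(ex), lambda = ex)
```

Every multinomial draw sums to the observed total, whereas the Poisson totals vary from one simulation to the next.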

Value

Returns a smerc_optimal_ubpop object. This includes:

ubpop_seq

The sequence of population bounds considered

elbow_method

An object with statistics related to the elbow method

gini_method

An object with statistics related to the Gini method

elbow_ubpop

The population upper bound suggested by the elbow method

gini_ubpop

The population upper bound suggested by the Gini method
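The components listed above can be extracted with the usual list accessor. The sketch below assumes the smerc package is installed, that the returned object behaves like a named list, and that elbow_ubpop is a single proportion suitable for the ubpop argument of scan.test; the small nsim value is chosen only to keep the run short.

```r
library(smerc) # assumes the smerc package is installed

data(nydf)
coords <- with(nydf, cbind(longitude, latitude))

# Compute the statistics over a coarse grid of candidate bounds.
ubpop_stats <- optimal_ubpop(
  coords = coords, cases = nydf$cases,
  pop = nydf$pop, nsim = 49,
  ubpop_seq = seq(0.05, 0.5, by = 0.05)
)

# Suggested upper bounds from each method.
ubpop_stats$elbow_ubpop
ubpop_stats$gini_ubpop

# Use one of the suggestions as the population upper bound in a scan test
# (assumes elbow_ubpop is a single value between 0 and 1).
scan_out <- scan.test(
  coords = coords, cases = nydf$cases, pop = nydf$pop,
  nsim = 49, ubpop = ubpop_stats$elbow_ubpop
)
```

The two methods need not agree, so it can be worth comparing the clusters found under each suggested bound.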

Author(s)

Joshua French

References

Meysami, Mohammad, French, Joshua P., and Lipner, Ettie M. The estimation of the optimal cluster upper bound for scan methods in retrospective disease surveillance. Submitted.

Han, J., Zhu, L., Kulldorff, M. et al. Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics. Int J Health Geogr 15, 27 (2016). <doi:10.1186/s12942-016-0056-6>

See Also

scan.test

Examples

data(nydf)
coords <- with(nydf, cbind(longitude, latitude))
ubpop_stats <- optimal_ubpop(
  coords = coords, cases = nydf$cases,
  pop = nydf$pop, nsim = 49,
  ubpop_seq = seq(0.05, 0.5, by = 0.05)
)
ubpop_stats
## Not run: 
plot(ubpop_stats)

## End(Not run)

jpfrench81/smerc documentation built on Jan. 13, 2024, 4:30 a.m.