iterative_scan: Iterative Scan

iterative_scan    R Documentation

Iterative Scan

Description

Runs the scan statistic iteratively, removing the cases of each detected cluster before the next iteration. Supports tree-spatial, circular, and tree-only scans, dispatched by which arguments are supplied.

Usage

iterative_scan(
  cases = NULL,
  population = NULL,
  region_id = NULL,
  x = NULL,
  y = NULL,
  node_id = NULL,
  tree = NULL,
  tree_node_id = NULL,
  tree_parent_id = NULL,
  max_iter = 5L,
  alpha = 0.05,
  nsim = 999L,
  max_pop_pct = 0.5,
  model = c("poisson", "binomial"),
  seed = NULL,
  verbose = TRUE,
  n_cores = 1L
)

Arguments

cases

Numeric vector. For tree-spatial scan: one entry per (region, leaf) row. For circular: one entry per region. For tree-only: one entry per leaf.

population

Numeric vector parallel to cases.

region_id, x, y

Vectors parallel to cases (omit for tree-only scan).

node_id

Vector parallel to cases (omit for circular and tree-only scans).

tree

Tree as a 2-column data.frame (node_id, parent_id), or use tree_node_id/tree_parent_id. Omit for circular scan.

tree_node_id, tree_parent_id

Optional. Parallel vectors as an alternative to tree.

max_iter

Integer. Maximum number of iterations.

alpha

Numeric. Significance threshold applied to Holm-Bonferroni adjusted p-values.

nsim

Integer. Number of Monte Carlo simulations run in each iteration.

max_pop_pct

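Numeric. Maximum share of the total population that a candidate cluster may cover (default 0.5). Passed unchanged to the inner scans.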

model

Character. "poisson" or "binomial".

seed

Integer or NULL. Random seed for reproducible Monte Carlo simulations.

verbose

Logical. Whether to print progress messages.

n_cores

Integer. Number of OpenMP threads.
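As an illustration of the two ways to supply a tree described above, the following sketch builds a small made-up hierarchy as a 2-column data.frame and then as the equivalent pair of parallel vectors. The node labels are hypothetical, and encoding the root's parent as NA is an assumption, since this page does not say how the root should be marked.

```r
# Hypothetical 5-node tree: a root, two internal nodes, and two leaves under "A".
# ASSUMPTION: the root's parent is encoded as NA (not confirmed by this page).
tree <- data.frame(
  node_id   = c("root", "A", "B", "A1", "A2"),
  parent_id = c(NA, "root", "root", "A", "A"),
  stringsAsFactors = FALSE
)

# The same tree as two parallel vectors, the alternative to `tree`
tree_node_id   <- tree$node_id
tree_parent_id <- tree$parent_id

# Sanity check: every non-missing parent must itself be listed as a node
parents_known <- all(tree_parent_id[!is.na(tree_parent_id)] %in% tree_node_id)
print(parents_known)
```

Either form carries the same information, so the choice is only a matter of which is more convenient to build from the data at hand.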

Details

Note on methodology. This iterative procedure is not part of the original tree-spatial scan statistic of Cancado et al. (2025). It is an extension inspired by the conditional scan approach of Zhang et al. (2010), which sequentially removes cases attributed to detected clusters before re-running the scan to find additional, distinct anomalies. To reproduce the secondary-cluster procedure described in Section 5.1.1 of Cancado et al. (2025), use filter_clusters or get_cluster_regions with n_clusters > 1 on a single scan result instead.

Multiple-testing correction. Because each iteration is a separate hypothesis test on data that has been modified by the previous iteration, the raw p-values overstate significance. This function collects raw p-values from all iterations performed and applies the Holm-Bonferroni correction (p.adjust with method = "holm") at the end. The returned clusters data.frame includes both the raw p-value (pvalue) and the adjusted p-value (pvalue_adjusted), plus a logical significant column indicating whether the cluster is significant after correction at level alpha.
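As a small illustration of the correction step, the sketch below applies base R's p.adjust to three made-up raw p-values (not output from this package) and derives a significant flag in the way the paragraph above describes. The values in raw_p, the iteration column, and the use of <= when comparing with alpha are assumptions for illustration only.

```r
# Hypothetical raw p-values from three iterations (illustrative values only)
raw_p <- c(0.01, 0.04, 0.03)
alpha <- 0.05

# Holm-Bonferroni step-down adjustment from base R (stats::p.adjust)
adj_p <- p.adjust(raw_p, method = "holm")

# Mock table mirroring the documented column names.
# ASSUMPTION: significance is taken as adjusted p-value <= alpha.
clusters <- data.frame(
  iteration       = seq_along(raw_p),
  pvalue          = raw_p,
  pvalue_adjusted = adj_p,
  significant     = adj_p <= alpha
)
print(clusters)
```

In this toy input all three raw p-values are below 0.05, but only the first iteration remains significant after adjustment (adjusted values 0.03, 0.06, 0.06).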

The loop stops early only when the residual signal is exhausted (the most likely cluster has LR = 0 or zero cases), not based on raw p-values, since the multiple-testing correction depends on the full set of tests.
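The stopping rule can be sketched with a deliberately simplified loop. This is not the package's implementation: toy_scan and toy_iterative are hypothetical helpers in which every single region is a candidate cluster, the Poisson log-likelihood ratio of Kulldorff (1997) is used, "removing" a cluster means zeroing its cases, and no Monte Carlo p-values are computed. The stopping condition is written on the log scale (log-likelihood ratio equal to 0), which is an assumption about how the rule above translates.

```r
# Hypothetical stand-in for a scan: each region alone is a candidate cluster.
toy_scan <- function(cases, population) {
  C <- sum(cases); N <- sum(population)
  if (C == 0) return(list(region = NA_integer_, cases = 0, llr = 0))
  expected <- C * population / N
  llr <- numeric(length(cases))
  for (i in seq_along(cases)) {
    c_in <- cases[i]; e_in <- expected[i]
    if (c_in > e_in) {  # only excess-risk candidates score above 0
      inside  <- c_in * log(c_in / e_in)
      outside <- if (C > c_in) (C - c_in) * log((C - c_in) / (C - e_in)) else 0
      llr[i]  <- inside + outside
    }
  }
  best <- which.max(llr)
  list(region = best, cases = cases[best], llr = llr[best])
}

# Iterate: scan, record the most likely cluster, remove its cases, repeat.
toy_iterative <- function(cases, population, max_iter = 5L) {
  found <- list()
  for (iter in seq_len(max_iter)) {
    res <- toy_scan(cases, population)
    # Stop only when the residual signal is exhausted, never on a p-value
    if (res$llr == 0 || res$cases == 0) break
    found[[length(found) + 1L]] <- data.frame(
      iteration = iter, region = res$region, cases = res$cases, llr = res$llr
    )
    cases[res$region] <- 0  # remove the detected cluster's cases
  }
  do.call(rbind, found)
}

out <- toy_iterative(cases = c(30, 5, 4, 20), population = rep(100, 4))
print(out)
```

With this toy input the loop records four clusters and then stops on the fifth pass because no cases remain. Every iteration that ran is kept, which is what lets a correction over the full set of tests be applied afterwards.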

Value

An object of class "iterative_scan" with components:

clusters

A data.frame with one row per iteration, containing the raw p-value (pvalue), the Holm-Bonferroni adjusted p-value (pvalue_adjusted), a logical significant column, plus the cluster's node, regions, cases, expected, RR, and LLR.

iterations

A list with the full scan result of each iteration.

regions, tree, alpha, n_iter, scan_type

Bookkeeping fields.

References

Cancado, A. L. F., Oliveira, G. S., Quadros, A. V. C., & Duczmal, L. H. (2025). A tree-spatial scan statistic. Environmental and Ecological Statistics, 32, 953-978.

Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics - Theory and Methods, 26(6), 1481-1496.

Zhang, Z., Assuncao, R., & Kulldorff, M. (2010). Spatial scan statistics adjusted for multiple clusters. Journal of Probability and Statistics, 2010, 642379.

Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6(2), 65-70.

See Also

treespatial_scan, circular_scan, tree_scan, filter_clusters, get_cluster_regions

Examples


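# Load the example datasets
data(london_collisions); data(london_tree)

# Tree-spatial scan: region, coordinate, node, and tree arguments are all supplied
result <- iterative_scan(
  cases       = london_collisions$cases,
  population  = london_collisions$population,
  region_id   = london_collisions$region_id,
  x           = london_collisions$x,
  y           = london_collisions$y,
  node_id     = london_collisions$node_id,
  tree        = london_tree,
  max_iter = 3, nsim = 99, seed = 42  # few iterations and simulations keep the example fast
)

# Summarise the clusters found across iterations
print(result)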


treeSS documentation built on May 16, 2026, 1:08 a.m.