View source: R/iterative_scan.R
| iterative_scan | R Documentation |
Runs the scan statistic iteratively, removing the cases of each detected cluster before the next iteration. Supports tree-spatial, circular, and tree-only scans, dispatched by which arguments are supplied.
iterative_scan(
  cases = NULL,
  population = NULL,
  region_id = NULL,
  x = NULL,
  y = NULL,
  node_id = NULL,
  tree = NULL,
  tree_node_id = NULL,
  tree_parent_id = NULL,
  max_iter = 5L,
  alpha = 0.05,
  nsim = 999L,
  max_pop_pct = 0.5,
  model = c("poisson", "binomial"),
  seed = NULL,
  verbose = TRUE,
  n_cores = 1L
)
cases |
Numeric vector. For tree-spatial scan: one entry per (region, leaf) row. For circular: one entry per region. For tree-only: one entry per leaf. |
population |
Numeric vector parallel to |
region_id, x, y |
Vectors parallel to |
node_id |
Vector parallel to |
tree |
Tree as a 2-column data.frame ( |
tree_node_id, tree_parent_id |
Optional. Parallel vectors as an alternative to |
max_iter |
Integer. Maximum number of iterations. |
alpha |
Numeric. Significance threshold applied to Holm-Bonferroni adjusted p-values. |
nsim |
Integer. Number of Monte Carlo simulations per iteration. |
max_pop_pct |
Numeric. Passed to inner scans. |
model |
Character. |
seed |
Integer or |
verbose |
Logical. |
n_cores |
Integer. Number of OpenMP threads. |
Note on methodology. This iterative procedure is not part
of the original tree-spatial scan statistic of Cancado et al. (2025). It is
an extension inspired by the conditional scan approach of Zhang et al.
(2010), which sequentially removes cases attributed to detected clusters
before re-running the scan to find additional, distinct anomalies. To
reproduce the secondary-cluster procedure described in Section 5.1.1 of
Cancado et al. (2025), use filter_clusters or
get_cluster_regions with n_clusters > 1 on a single
scan result instead.
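The remove-and-rescan idea can be sketched with a deliberately simplified toy in base R. Everything in it is an assumption made for illustration: the helper names toy_llr and toy_iterative are hypothetical, candidate clusters are single regions only, and "removing" a cluster's cases is taken to mean setting them to zero. It is not the package's implementation and it omits the Monte Carlo test entirely.

```r
# Toy illustration only: single-region candidates, Poisson model, no p-values.
# toy_llr() and toy_iterative() are hypothetical helpers, not package functions.
toy_llr <- function(c_in, e_in, c_tot) {
  # Poisson log-likelihood ratio for one candidate; 0 unless cases exceed expectation
  if (c_in <= e_in) return(0)
  c_out <- c_tot - c_in
  e_out <- c_tot - e_in
  term_out <- if (c_out > 0) c_out * log(c_out / e_out) else 0
  c_in * log(c_in / e_in) + term_out
}

toy_iterative <- function(cases, population, max_iter = 3L) {
  found <- integer(0)
  for (i in seq_len(max_iter)) {
    c_tot <- sum(cases)
    if (c_tot == 0) break                      # nothing left to scan
    expected <- c_tot * population / sum(population)
    llr <- mapply(toy_llr, cases, expected, MoreArgs = list(c_tot = c_tot))
    best <- which.max(llr)
    if (llr[best] <= 0) break                  # residual signal exhausted
    found <- c(found, best)
    cases[best] <- 0                           # assumption: removal = zeroing the cases
  }
  found
}

found <- toy_iterative(cases = c(30, 5, 4, 20, 6), population = rep(100, 5))
found
```

Note that the expected counts are recomputed from the remaining cases on every pass, so each iteration tests a different data set.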
Multiple-testing correction. Because each iteration is a separate
hypothesis test on data that has been modified by the previous iteration,
the raw p-values overstate significance. This function collects raw p-values
from all iterations performed and applies the Holm-Bonferroni correction
(p.adjust with method = "holm") at the end. The
returned clusters data.frame includes both the raw p-value
(pvalue) and the adjusted p-value (pvalue_adjusted), plus a
logical significant column indicating whether the cluster is
significant after correction at level alpha.
The loop stops early only when the residual signal is exhausted (the most
likely cluster has LLR = 0 or contains zero cases). It does not stop on the
basis of raw p-values, because the multiple-testing correction depends on
the full set of tests.
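The effect of the correction can be seen on a small set of made-up raw p-values. The numbers below are illustrative and are not output of iterative_scan. The sketch uses p.adjust from base R with method = "holm", and assumes that significance means an adjusted p-value at or below alpha.

```r
# Hypothetical raw p-values from four iterations (illustrative only)
raw_p <- c(0.001, 0.020, 0.030, 0.400)
alpha <- 0.05

# Holm-Bonferroni adjustment across all iterations performed
adj_p <- p.adjust(raw_p, method = "holm")

# Assumption: a cluster is flagged when its adjusted p-value is <= alpha
data.frame(
  iteration       = seq_along(raw_p),
  pvalue          = raw_p,
  pvalue_adjusted = adj_p,
  significant     = adj_p <= alpha
)
```

In this example the second and third raw p-values are below 0.05, but their adjusted values are both 0.06, so only the first iteration remains significant.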
An object of class "iterative_scan" with components:
A data.frame (clusters) with one row per iteration, containing the raw p-value (pvalue), the Holm-Bonferroni adjusted p-value (pvalue_adjusted), a logical significant column, and the cluster's node, regions, cases, expected, RR, and LLR.
A list with the full scan result of each iteration.
Bookkeeping fields.
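A common follow-up is to keep only the iterations that survive the correction. The sketch below uses a mock data.frame with the documented p-value columns as a stand-in for the clusters component. The values are invented, and accessing the component as result$clusters is an assumption about how the returned object is structured.

```r
# Mock stand-in for the clusters data.frame (values are invented)
clusters <- data.frame(
  pvalue          = c(0.001, 0.020, 0.400),
  pvalue_adjusted = c(0.003, 0.040, 0.400),
  significant     = c(TRUE, TRUE, FALSE)
)

# Keep the iterations whose cluster is significant after correction
sig <- clusters[clusters$significant, ]
nrow(sig)
```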
Cancado, A. L. F., Oliveira, G. S., Quadros, A. V. C., & Duczmal, L. H. (2025). A tree-spatial scan statistic. Environmental and Ecological Statistics, 32, 953-978.
Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics - Theory and Methods, 26(6), 1481-1496.
Zhang, Z., Assuncao, R., & Kulldorff, M. (2010). Spatial scan statistics adjusted for multiple clusters. Journal of Probability and Statistics, 2010, 642379.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6(2), 65-70.
treespatial_scan, circular_scan, tree_scan, filter_clusters, get_cluster_regions
# Load the example data sets
data(london_collisions); data(london_tree)

# Up to 3 iterations, 99 Monte Carlo simulations each, fixed seed
result <- iterative_scan(
  cases = london_collisions$cases,
  population = london_collisions$population,
  region_id = london_collisions$region_id,
  x = london_collisions$x,
  y = london_collisions$y,
  node_id = london_collisions$node_id,
  tree = london_tree,
  max_iter = 3, nsim = 99, seed = 42
)
print(result)