| treespatial_scan | R Documentation |
Performs the tree-spatial scan statistic, combining Kulldorff's circular
scan (spatial clusters) with the tree-based scan (hierarchical data
mining). Searches all combinations of spatial zones and tree branches
to identify pairs (z, g) with significantly more cases than
expected.
treespatial_scan(
  cases,
  population,
  region_id,
  x,
  y,
  node_id,
  tree = NULL,
  tree_node_id = NULL,
  tree_parent_id = NULL,
  max_pop_pct = 0.5,
  nsim = 999L,
  alpha = 0.05,
  model = c("poisson", "binomial"),
  seed = NULL,
  n_cores = 1L
)
cases |
Numeric vector. Number of cases observed for each
(region, leaf) pair. Length |
population |
Numeric vector. Population (or denominator) of the region for each row. The same value should be repeated across all rows of a given region; if it varies, the first occurrence per region is used and a warning is issued. |
region_id |
Vector of region identifiers. Length |
x, y |
Numeric vectors of region centroid coordinates. Like
|
node_id |
Vector of tree leaf identifiers. Length |
tree |
A |
tree_node_id, tree_parent_id |
Optional. Parallel vectors describing
the tree edges, used as an alternative to tree |
max_pop_pct |
Numeric. Maximum proportion of total population
allowed inside a zone. Default 0.5 |
nsim |
Integer. Number of Monte Carlo simulations. Default 999 |
alpha |
Numeric. Significance level. Default 0.05 |
model |
Character. Either "poisson" or "binomial" |
seed |
Integer or NULL |
n_cores |
Integer. OpenMP threads for the Monte Carlo loop.
Default 1 |
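To make the role of max_pop_pct concrete, the sketch below enumerates circular candidate zones in the usual Kulldorff style: each circle is centred on a region centroid and grows by adding the nearest regions until the population cap would be exceeded. The helper `candidate_zones` and its toy inputs are hypothetical and written only for illustration; the package's internal zone construction may differ.

```r
# Hypothetical helper (NOT part of the package): enumerate circular zones
# whose population does not exceed max_pop_pct of the total population.
candidate_zones <- function(x, y, pop, max_pop_pct = 0.5) {
  stopifnot(length(x) == length(y), length(x) == length(pop))
  total_pop <- sum(pop)
  zones <- list()
  for (i in seq_along(x)) {
    # Order all regions by Euclidean distance from centroid i.
    d <- sqrt((x - x[i])^2 + (y - y[i])^2)
    ord <- order(d)
    cum_pop <- cumsum(pop[ord])
    # Largest number of nearest regions that still respects the cap.
    k_max <- sum(cum_pop <= max_pop_pct * total_pop)
    for (k in seq_len(k_max)) {
      zones[[length(zones) + 1L]] <- sort(ord[seq_len(k)])
    }
  }
  # Different centres can generate the same set of regions; keep each once.
  unique(zones)
}

# Toy input: four regions on a line, equal populations (assumed values).
pop <- c(100, 100, 100, 100)
zones <- candidate_zones(
  x = c(0, 1, 2, 10),
  y = c(0, 0, 0, 0),
  pop = pop,
  max_pop_pct = 0.5
)
zone_pop <- vapply(zones, function(z) sum(pop[z]), numeric(1))
```

With a cap of 0.5, no zone in this toy input can contain more than two of the four regions, which is why lowering max_pop_pct shrinks the search space.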
Inputs are passed as parallel vectors of equal length (one entry per
(region, tree-leaf) observation). The user is responsible for choosing
which column to use as population (e.g., total population, live
births, person-years), making the choice of denominator explicit.
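As a minimal sketch of the parallel-vector layout, the example below starts from a hypothetical long-format data frame `obs` (one row per (region, leaf) observation) and checks that the population is constant within each region before the columns are handed to the function. The data frame, its column names, and the values are assumptions made for illustration.

```r
# Hypothetical long-format data: one row per (region, leaf) observation.
obs <- data.frame(
  region_id  = rep(c("A", "B", "C"), each = 2),
  node_id    = rep(c(4, 5), times = 3),
  cases      = c(3, 1, 0, 2, 5, 4),
  population = c(1000, 1000, 800, 800, 1200, 1200),
  x          = rep(c(0.0, 1.5, 3.0), each = 2),
  y          = rep(c(0.0, 2.0, 1.0), each = 2)
)

# Count distinct population values per region; each count should be 1.
n_pop_values <- tapply(obs$population, obs$region_id,
                       function(p) length(unique(p)))
inconsistent <- names(n_pop_values)[n_pop_values > 1]
if (length(inconsistent) > 0) {
  warning("Population varies within region(s): ",
          paste(inconsistent, collapse = ", "))
}

# The columns are already parallel vectors of equal length, so they can be
# passed directly (call left commented out; `tree` would also be required):
# result <- treespatial_scan(
#   cases = obs$cases, population = obs$population,
#   region_id = obs$region_id, x = obs$x, y = obs$y,
#   node_id = obs$node_id, tree = tree
# )
```

Running the check up front makes the choice of denominator visible and avoids relying on the first-occurrence fallback.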
Secondary clusters. The returned object contains the most likely
cluster as well as the full set of evaluated (zone, branch) pairs in
secondary_clusters. To obtain the distinct secondary clusters as
described in Section 5.1.1 of Cancado et al. (2025) (filtering out pairs
that overlap in regions or branches with already-retained clusters), use
filter_clusters or get_cluster_regions with
n_clusters > 1.
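The overlap rule can be illustrated with a small greedy filter in base R. This is only a sketch of the idea, not the implementation behind filter_clusters: the list layout of `candidates` (fields `regions`, `leaves`, `llr`) and the helper `greedy_distinct` are hypothetical, and the actual structure of secondary_clusters may differ.

```r
# Hypothetical candidates, NOT the actual layout of `secondary_clusters`.
candidates <- list(
  list(regions = c(1, 2, 3), leaves = c(4),    llr = 25.1),
  list(regions = c(2, 3),    leaves = c(4, 5), llr = 19.4),
  list(regions = c(7, 8),    leaves = c(6),    llr = 8.7),
  list(regions = c(9),       leaves = c(6, 7), llr = 5.2),
  list(regions = c(9, 10),   leaves = c(7),    llr = 4.9)
)

# Walk the candidates from highest to lowest statistic and keep a pair only
# if it shares no region and no leaf with any pair already kept.
greedy_distinct <- function(candidates, n_clusters = 2L) {
  llr <- vapply(candidates, function(cl) cl$llr, numeric(1))
  kept <- list()
  for (cl in candidates[order(llr, decreasing = TRUE)]) {
    overlaps <- vapply(kept, function(k) {
      length(intersect(k$regions, cl$regions)) > 0 ||
        length(intersect(k$leaves, cl$leaves)) > 0
    }, logical(1))
    if (!any(overlaps)) kept[[length(kept) + 1L]] <- cl
    if (length(kept) >= n_clusters) break
  }
  kept
}

distinct <- greedy_distinct(candidates, n_clusters = 3L)
kept_llr <- vapply(distinct, function(cl) cl$llr, numeric(1))
```

In this toy input the second candidate is dropped because it shares regions with the first, and the fourth because it shares a leaf with the third.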
An object of class "treespatial_scan".
Cancado, A. L. F., Oliveira, G. S., Quadros, A. V. C., & Duczmal, L. (2025). A tree-spatial scan statistic. Environmental and Ecological Statistics, 32, 953–978. doi:10.1007/s10651-025-00670-w
circular_scan, tree_scan,
aggregate_tree, filter_clusters,
get_cluster_regions, iterative_scan
set.seed(123)
n_regions <- 10
tree <- data.frame(
  node_id   = c(1, 2, 3, 4, 5, 6, 7),
  parent_id = c(NA, 1, 1, 2, 2, 3, 3)
)
# Build vectors: one row per (region, leaf) combination
grid <- expand.grid(region_id = 1:n_regions, node_id = c(4, 5, 6, 7))
xs <- runif(n_regions, 0, 10)[grid$region_id]
ys <- runif(n_regions, 0, 10)[grid$region_id]
cs <- rpois(nrow(grid), lambda = 5)
cs[grid$node_id == 4 & grid$region_id %in% 1:3] <- rpois(3, 30)
result <- treespatial_scan(
  cases      = cs,
  population = rep(1000, nrow(grid)),
  region_id  = grid$region_id,
  x          = xs,
  y          = ys,
  node_id    = grid$node_id,
  tree       = tree,
  nsim       = 99
)
print(result)
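To show what a "branch" means for a tree given as node_id/parent_id pairs, the sketch below lists the leaves that fall under each node of the example tree. The helper `leaves_under` is hypothetical and written in base R for illustration; it is not a function exported by the package.

```r
# Same tree as in the example: node 1 is the root, nodes 4 to 7 are leaves.
tree <- data.frame(
  node_id   = c(1, 2, 3, 4, 5, 6, 7),
  parent_id = c(NA, 1, 1, 2, 2, 3, 3)
)

# Hypothetical helper: recursively collect the leaves below a node.
leaves_under <- function(node, tree) {
  children <- tree$node_id[!is.na(tree$parent_id) & tree$parent_id == node]
  if (length(children) == 0) return(node)  # a leaf forms its own branch
  unlist(lapply(children, leaves_under, tree = tree))
}

# One entry per node: the set of leaves whose cases that branch aggregates.
branches <- lapply(tree$node_id, leaves_under, tree = tree)
names(branches) <- tree$node_id
```

Each candidate pair (z, g) then combines one spatial zone z with the leaves of one such branch g, so a signal planted in leaf 4 can also surface through its ancestors 2 and 1.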