kulldorff: Kulldorff Cluster Detection Method

kulldorff    R Documentation

Description

Kulldorff's spatial cluster detection method for a study region with n areas. The method constructs candidate zones by successively aggregating nearest-neighboring areas until a specified proportion of the total study population is included. Given the observed number of cases, the likelihood of each zone is computed under either a binomial or a Poisson model. The procedure reports the zone that is the most likely cluster and assesses its significance via Monte Carlo sampling. Secondary clusters whose Monte Carlo p-values fall below the α-threshold are reported as well.
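The zone construction described above can be sketched in base R. This is a minimal illustration rather than the package's implementation: build_zones is a hypothetical helper, and it assumes Euclidean distances between centroids and that a zone is kept while its cumulative population share is at most pop.upper.bound.

```r
# Hypothetical helper, not part of SpatialEpi: list candidate zones by
# growing outward from each centroid in order of Euclidean distance.
build_zones <- function(geo, population, pop.upper.bound) {
  geo <- as.matrix(geo)
  total.pop <- sum(population)
  dists <- as.matrix(dist(geo))  # n x n pairwise centroid distances
  zones <- list()
  for (i in seq_len(nrow(geo))) {
    # Area i comes first (distance 0), followed by its nearest neighbors
    neighbors <- order(dists[i, ])
    cum.prop <- cumsum(population[neighbors]) / total.pop
    # Number of nested zones around area i that respect the population bound
    n.keep <- sum(cum.prop <= pop.upper.bound)
    for (k in seq_len(n.keep)) {
      zones[[length(zones) + 1]] <- neighbors[seq_len(k)]
    }
  }
  zones
}

# Toy data (assumed for illustration): four areas on a line, equal populations
geo <- cbind(x = c(0, 1, 2, 3), y = c(0, 0, 0, 0))
population <- c(100, 100, 100, 100)
zones <- build_zones(geo, population, pop.upper.bound = 0.5)
```

Each area seeds its own nested sequence of zones, so the same set of areas can appear more than once when reached from different starting centroids.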

Usage

kulldorff(
  geo,
  cases,
  population,
  expected.cases = NULL,
  pop.upper.bound,
  n.simulations,
  alpha.level,
  plot = TRUE
)

Arguments

geo

an n x 2 table of the (x,y)-coordinates of the area centroids

cases

aggregated case counts for all n areas

population

aggregated population counts for all n areas

expected.cases

expected numbers of disease cases for all n areas

pop.upper.bound

the upper bound on the proportion of the total population each zone can include

n.simulations

number of Monte Carlo samples used for significance measures

alpha.level

alpha-level threshold used to declare significance

plot

flag for whether to plot histogram of Monte Carlo samples of the log-likelihood of the most likely cluster

Details

If expected.cases is NULL, the binomial likelihood is used; otherwise, a Poisson model is assumed. Typical values of n.simulations are 99, 999, and 9999.
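Values such as 99, 999, and 9999 are convenient because a rank-based Monte Carlo p-value then falls on a round grid (multiples of 1/100, 1/1000, or 1/10000). The sketch below shows one common convention, rank / (n.simulations + 1). The helper mc_p_value is hypothetical, and the exact tie-handling used inside kulldorff() is an assumption here, not something this page states.

```r
# Hypothetical helper: rank-based Monte Carlo p-value.
# Assumption: the observed statistic is ranked among the simulated ones,
# with rank 1 meaning it exceeds every simulated value.
mc_p_value <- function(observed, simulated) {
  mc.rank <- sum(simulated >= observed) + 1
  p.value <- mc.rank / (length(simulated) + 1)
  list(monte.carlo.rank = mc.rank, p.value = p.value)
}

# Deterministic stand-in for 999 simulated log-likelihood ratios
simulated <- seq_len(999)

extreme <- mc_p_value(observed = 1000, simulated = simulated)
borderline <- mc_p_value(observed = 950.5, simulated = simulated)
```

With 999 simulations the smallest attainable p-value is 1/1000, so an alpha.level of 0.05 corresponds to the observed statistic ranking 50th or better.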

Value

List containing:

most.likely.cluster

information on the most likely cluster

secondary.clusters

information on secondary clusters; NULL if there are none

type

type of likelihood

log.lkhd

log-likelihood of each zone considered

simulated.log.lkhd

n.simulations Monte Carlo samples of the log-likelihood of the most likely cluster

Note

The most.likely.cluster element is a list, and each secondary cluster in secondary.clusters is a list of the same form, reporting:

location.IDs.included    IDs of the areas in the cluster, in order of distance
population               population of the cluster
number.of.cases          number of cases in the cluster
expected.cases           expected number of cases in the cluster
SMR                      estimated standardized mortality ratio (SMR) of the cluster
log.likelihood.ratio     log-likelihood ratio of the cluster
monte.carlo.rank         rank of the cluster's likelihood among the Monte Carlo simulated values
p.value                  Monte Carlo p-value
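To make SMR and log.likelihood.ratio concrete, the following sketch computes both for a single zone under the Poisson model, using the likelihood ratio from Kulldorff (1997). The helper poisson_llr is hypothetical and is not the package's code. It assumes the expected counts are scaled so that they sum to the total number of observed cases, and that only zones with elevated risk inside count as clusters.

```r
# Hypothetical helper: SMR and Poisson log-likelihood ratio for one zone.
# Assumption: expected counts over the whole region sum to total.cases.
poisson_llr <- function(zone.cases, zone.expected, total.cases) {
  out.cases <- total.cases - zone.cases        # cases outside the zone
  out.expected <- total.cases - zone.expected  # expected cases outside the zone
  smr <- zone.cases / zone.expected
  # Zones whose risk is not higher inside than outside get a ratio of 0
  if (smr <= out.cases / out.expected) {
    return(list(SMR = smr, log.likelihood.ratio = 0))
  }
  # obs * log(obs / exp), taking the limit 0 when obs is 0
  term <- function(obs, exp) if (obs == 0) 0 else obs * log(obs / exp)
  llr <- term(zone.cases, zone.expected) + term(out.cases, out.expected)
  list(SMR = smr, log.likelihood.ratio = llr)
}

# Toy numbers (assumed): 20 observed vs 10 expected cases, 100 cases in total
elevated <- poisson_llr(zone.cases = 20, zone.expected = 10, total.cases = 100)
deficit <- poisson_llr(zone.cases = 5, zone.expected = 10, total.cases = 100)
```

An SMR above 1 means the zone has more cases than expected; the log-likelihood ratio grows with both the size of that excess and the number of cases supporting it.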

Author(s)

Albert Y. Kim

References

SaTScan: Software for the spatial, temporal, and space-time scan statistics. https://www.satscan.org/

Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics: Theory and Methods, 26, 1481–1496.

Kulldorff, M. and Nagarwalla, N. (1995) Spatial disease clusters: Detection and inference. Statistics in Medicine, 14, 799–810.

Examples

## Load the package and the Pennsylvania lung cancer data
library(SpatialEpi)
data(pennLC)
data <- pennLC$data

## Process geographical information and convert to grid
geo <- pennLC$geo[,2:3]
geo <- latlong2grid(geo)

## Get aggregated counts of population and cases for each county
population <- tapply(data$population,data$county,sum)
cases <- tapply(data$cases,data$county,sum)

## Based on the 16 strata levels, compute the expected numbers of disease cases
n.strata <- 16
expected.cases <- expected(data$population, data$cases, n.strata)

## Set Parameters
pop.upper.bound <- 0.5
n.simulations <- 999
alpha.level <- 0.05
plot <- TRUE

## Kulldorff using Binomial likelihoods
binomial <- kulldorff(geo, cases, population, NULL, pop.upper.bound, n.simulations, 
                     alpha.level, plot)
cluster <- binomial$most.likely.cluster$location.IDs.included

## Plot the most likely cluster (binomial model) in red
plot(pennLC$spatial.polygon,axes=TRUE)
plot(pennLC$spatial.polygon[cluster],add=TRUE,col="red")
title("Most Likely Cluster")

## Kulldorff using Poisson likelihoods
poisson <- kulldorff(geo, cases, population, expected.cases, pop.upper.bound, 
                    n.simulations, alpha.level, plot)
cluster <- poisson$most.likely.cluster$location.IDs.included

## Plot the most likely cluster (Poisson model) in red
plot(pennLC$spatial.polygon,axes=TRUE)
plot(pennLC$spatial.polygon[cluster],add=TRUE,col="red")
title("Most Likely Cluster Controlling for Strata")

SpatialEpi documentation built on March 7, 2023, 8 p.m.