kulldorff | R Documentation |
Kulldorff spatial cluster detection method for a study region with n areas. The method constructs zones by consecutively aggregating nearest-neighboring areas until a specified proportion of the total study population (pop.upper.bound) is included. Given the observed number of cases, the likelihood of each zone is computed using either a binomial or a Poisson likelihood. The procedure reports the zone that is the most likely cluster and generates significance measures via Monte Carlo sampling. Secondary clusters whose Monte Carlo p-values are below the α-threshold are reported as well.
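The zone construction described above can be sketched in base R. This is a minimal illustration, not the package's internal code: `build_zones` is a hypothetical helper, Euclidean distance between centroids is assumed, and the cap is assumed to be inclusive (a zone is kept if its population is at most `pop.upper.bound` times the total).

```r
# Hypothetical helper (not part of the package): for each area, grow a zone
# by adding nearest neighbours one at a time while the population cap holds.
build_zones <- function(coords, population, pop.upper.bound) {
  coords <- as.matrix(coords)
  d <- as.matrix(dist(coords))              # pairwise Euclidean distances
  cap <- pop.upper.bound * sum(population)  # assumed inclusive upper bound
  zones <- list()
  for (i in seq_len(nrow(coords))) {
    nn <- order(d[i, ])                     # area i first, then its neighbours
    cum.pop <- cumsum(population[nn])
    for (k in which(cum.pop <= cap)) {
      zones[[length(zones) + 1]] <- nn[seq_len(k)]
    }
  }
  zones
}

# Toy data: four areas on a line, equal populations
coords <- cbind(x = c(0, 1, 2, 10), y = c(0, 0, 0, 0))
population <- c(100, 100, 100, 100)
zones <- build_zones(coords, population, pop.upper.bound = 0.5)
```

With a cap of 0.5, each of the four toy areas contributes two zones (itself, and itself plus its nearest neighbour). The sketch does not remove zones that contain the same set of areas in a different order.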
kulldorff( geo, cases, population, expected.cases = NULL, pop.upper.bound, n.simulations, alpha.level, plot = TRUE )
geo |
an n x 2 table of the (x, y)-coordinates of the area centroids |
cases |
aggregated case counts for all n areas |
population |
aggregated population counts for all n areas |
expected.cases |
expected numbers of disease cases for all n areas |
pop.upper.bound |
the upper bound on the proportion of the total population each zone can include |
n.simulations |
number of Monte Carlo samples used for significance measures |
alpha.level |
alpha-level threshold used to declare significance |
plot |
flag for whether to plot histogram of Monte Carlo samples of the log-likelihood of the most likely cluster |
If expected.cases is specified to be NULL, then the binomial likelihood is used. Otherwise, a Poisson model is assumed. Typical values of n.simulations are 99, 999, 9999.
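For the Poisson case, the quantity maximized over zones is the log-likelihood ratio from Kulldorff (1997). The sketch below evaluates it for a single zone; `poisson_llr` and `xlogxy` are hypothetical names introduced here, the expected counts are assumed to sum to the total number of cases, and the package's internal computation may differ in detail.

```r
# x * log(x / y), with the convention 0 * log(0) = 0
xlogxy <- function(x, y) ifelse(x == 0, 0, x * log(x / y))

# Poisson log-likelihood ratio for one zone (Kulldorff, 1997).
#   c.in : observed cases inside the zone
#   e.in : expected cases inside the zone
#   C    : total observed cases in the study region
# Only zones with an excess of cases (c.in > e.in) score above zero.
poisson_llr <- function(c.in, e.in, C) {
  llr <- xlogxy(c.in, e.in) + xlogxy(C - c.in, C - e.in)
  ifelse(c.in > e.in, llr, 0)
}

llr.excess  <- poisson_llr(30, 20, 100)  # more cases than expected
llr.deficit <- poisson_llr(10, 20, 100)  # fewer cases than expected
```

The most likely cluster is the zone with the largest value of this statistic. The binomial version has the same structure but compares case proportions inside and outside the zone using population counts instead of expected counts.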
List containing:
most.likely.cluster |
information on the most likely cluster |
secondary.clusters |
information on secondary clusters; NULL if there are none |
type |
type of likelihood |
log.lkhd |
log-likelihood of each zone considered |
simulated.log.lkhd |
n.simulations Monte Carlo samples of the log-likelihood of the most likely cluster |
The most.likely.cluster and secondary.clusters list elements are themselves lists reporting:
location.IDs.included | IDs of areas in cluster, in order of distance |
population | population of cluster |
number.of.cases | number of cases in cluster |
expected.cases | expected number of cases in cluster |
SMR | estimated SMR of cluster |
log.likelihood.ratio | log-likelihood ratio of cluster |
monte.carlo.rank | rank of the likelihood of the cluster within the Monte Carlo simulated values |
p.value | Monte Carlo p-value |
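As an illustration of how a Monte Carlo rank and p-value relate, the sketch below uses the standard definition: the observed statistic is ranked among itself and the simulated statistics, and the p-value is the rank divided by the number of simulations plus one. `mc_p_value` is a hypothetical helper, and the package may treat ties differently.

```r
# Hypothetical helper: rank an observed statistic among simulated ones.
# Rank 1 means the observed value exceeds every simulated value.
mc_p_value <- function(observed, simulated) {
  rank <- 1 + sum(simulated >= observed)
  list(rank = rank, p.value = rank / (length(simulated) + 1))
}

# Toy example with 9 simulated statistics; only 3.4 is at least 3.0
simulated <- c(0.5, 1.2, 3.4, 0.9, 2.0, 0.1, 1.7, 0.3, 2.8)
res <- mc_p_value(observed = 3.0, simulated = simulated)
# res$rank is 2 and res$p.value is 0.2
```

Under this definition, choosing n.simulations as 99, 999, or 9999 gives denominators of 100, 1000, or 10000, so the smallest attainable p-values are 0.01, 0.001, and 0.0001.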
Albert Y. Kim
SaTScan: Software for the spatial, temporal, and space-time scan statistics. https://www.satscan.org/
Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics: Theory and Methods, 26, 1481–1496.
Kulldorff, M. and Nagarwalla, N. (1995) Spatial disease clusters: Detection and inference. Statistics in Medicine, 14, 799–810.
## Load Pennsylvania Lung Cancer Data
library(SpatialEpi)
data(pennLC)
data <- pennLC$data

## Process geographical information and convert to grid
geo <- pennLC$geo[,2:3]
geo <- latlong2grid(geo)

## Get aggregated counts of population and cases for each county
population <- tapply(data$population,data$county,sum)
cases <- tapply(data$cases,data$county,sum)

## Based on the 16 strata levels, compute expected numbers of disease
n.strata <- 16
expected.cases <- expected(data$population, data$cases, n.strata)

## Set Parameters
pop.upper.bound <- 0.5
n.simulations <- 999
alpha.level <- 0.05
plot <- TRUE

## Kulldorff using Binomial likelihoods
binomial <- kulldorff(geo, cases, population, NULL, pop.upper.bound, n.simulations, alpha.level, plot)
cluster <- binomial$most.likely.cluster$location.IDs.included

## plot
plot(pennLC$spatial.polygon,axes=TRUE)
plot(pennLC$spatial.polygon[cluster],add=TRUE,col="red")
title("Most Likely Cluster")

## Kulldorff using Poisson likelihoods
poisson <- kulldorff(geo, cases, population, expected.cases, pop.upper.bound, n.simulations, alpha.level, plot)
cluster <- poisson$most.likely.cluster$location.IDs.included

## plot
plot(pennLC$spatial.polygon,axes=TRUE)
plot(pennLC$spatial.polygon[cluster],add=TRUE,col="red")
title("Most Likely Cluster Controlling for Strata")