besag_newell | R Documentation |
Besag-Newell cluster detection method. Our implementation differs from the original paper in the following ways:
We base our analysis on k cases, rather than k other cases as prescribed in the paper.
We do not subtract 1 from the accumulated numbers of other cases and the accumulated numbers of others at risk, as the paper prescribes to discount selection bias.
M is the total number of areas included, not the number of additional areas included; that is, M starts at 1, not 0.
p-values are based not on the original value of k but on the actual number of cases observed once k or more cases have been accumulated. For example, if k = 10 and the accumulated counts over successive neighbors are 1, 2, 9, then 12 cases, the p-value is based on k = 12.
We do not provide a Monte-Carlo simulated R, the number of tests that attain significance at a fixed level α.
The first two differences and the last arise because we view the testing on an area-by-area level, rather than a case-by-case level.
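The accumulation rule for M and the observed k can be sketched in base R for a single centre area. This is an illustration only, not the package's internal code: the per-area counts and expected values are hypothetical (chosen so the accumulated cases are 1, 2, 9, 12), and the use of a Poisson upper-tail probability for the p-value is an assumption about how the test is evaluated.

```r
# Illustrative sketch only; not the package's internal code.
# Counts for one centre area and its neighbours, ordered by distance.
cases_by_distance    <- c(1, 1, 7, 3)        # hypothetical case counts
expected_by_distance <- c(2, 1.5, 2.5, 2)    # hypothetical expected counts
k <- 10

# Accumulate over areas in order of distance from the centre
cum_cases    <- cumsum(cases_by_distance)    # 1, 2, 9, 12
cum_expected <- cumsum(expected_by_distance)

# M counts the centre area itself, so the smallest possible value is 1
M <- which(cum_cases >= k)[1]

# The p-value is based on the cases actually accumulated, not on k itself
observed_k <- cum_cases[M]

# Assumed form of the test: Poisson probability of observed_k or more cases
p_value <- ppois(observed_k - 1, lambda = cum_expected[M], lower.tail = FALSE)
```

With these numbers the fourth area is the first at which at least k = 10 cases have been accumulated, so M is 4 and the p-value is computed from 12 cases rather than 10.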
besag_newell(geo, population, cases, expected.cases = NULL, k, alpha.level)
geo |
an n x 2 table of the (x,y)-coordinates of the area centroids |
population |
aggregated population counts for all n areas |
cases |
aggregated case counts for all n areas |
expected.cases |
expected numbers of disease for all n areas |
k |
number of cases to consider |
alpha.level |
alpha-level threshold used to declare significance |
For the population and cases tables, the rows are grouped by area first, and then within each area the counts for each stratum are listed. It is important that the tables are balanced: the strata appear in the same order for each area, and the count for each area/stratum combination appears exactly once (even if zero).
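A minimal sketch of a balanced layout, using base R only. The area labels, strata names and counts below are hypothetical, and the `layout` data frame and `is_balanced` check are illustrative helpers rather than part of the package.

```r
areas  <- c("A", "B", "C")        # hypothetical area labels
strata <- c("young", "old")       # hypothetical strata

# expand.grid varies its first argument fastest, so rows are grouped by area
# and the strata appear in the same order within every area
layout <- expand.grid(stratum = strata, area = areas, stringsAsFactors = FALSE)

# Every area/stratum combination gets a row, even when the count is zero
layout$population <- c(100, 80, 120, 90, 60, 0)
layout$cases      <- c(2, 5, 1, 4, 0, 0)

# Balance check: each combination appears exactly once, in a fixed strata order
is_balanced <- all(table(layout$area, layout$stratum) == 1) &&
  all(layout$stratum == rep(strata, times = length(areas)))

# Aggregate over strata to get one count per area
population <- tapply(layout$population, layout$area, sum)
cases      <- tapply(layout$cases, layout$area, sum)
```

The aggregated `population` and `cases` vectors then have one entry per area, which is the shape the corresponding arguments expect.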
List containing
clusters |
information on all clusters that are α-level significant, in decreasing order of the p-value |
p.values |
for each of the n areas, p-values of each cluster of size at least k |
m.values |
for each of the n areas, the number of areas needed to observe at least k cases |
observed.k.values |
based on m.values, the actual number of cases used to compute the p-values |
The clusters list elements are themselves lists reporting:
location.IDs.included | IDs of areas in cluster, in order of distance |
population | population of cluster |
number.of.cases | number of cases in cluster |
expected.cases | expected number of cases in cluster |
SMR | estimated SMR of cluster |
p.value | p-value |
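Because each element of clusters is itself a list, it can be convenient to flatten the elements into a data frame for inspection. The sketch below uses a mock `results` object that only mirrors the structure described above; every value in it is made up for illustration, and `cluster_summary` is a hypothetical name.

```r
# Mock object mirroring the documented return structure; all values are made up
results <- list(
  clusters = list(
    list(location.IDs.included = c(3, 7), population = 5000,
         number.of.cases = 30, expected.cases = 15, SMR = 30 / 15,
         p.value = 0.001),
    list(location.IDs.included = 12, population = 2000,
         number.of.cases = 12, expected.cases = 8, SMR = 12 / 8,
         p.value = 0.04)
  )
)

# One row per cluster, pulling the scalar fields out of each list element
cluster_summary <- data.frame(
  n.areas  = sapply(results$clusters, function(cl) length(cl$location.IDs.included)),
  cases    = sapply(results$clusters, function(cl) cl$number.of.cases),
  expected = sapply(results$clusters, function(cl) cl$expected.cases),
  SMR      = sapply(results$clusters, function(cl) cl$SMR),
  p.value  = sapply(results$clusters, function(cl) cl$p.value)
)

# Sort so the smallest p-values come first when printing
cluster_summary <- cluster_summary[order(cluster_summary$p.value), ]
```

With real output from besag_newell, the same sapply calls should apply as long as the list elements carry the names listed above.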
Albert Y. Kim
Besag J. and Newell J. (1991) The Detection of Clusters in Rare Diseases. Journal of the Royal Statistical Society. Series A (Statistics in Society), 154, 143–155.
library(SpatialEpi)

## Load Pennsylvania Lung Cancer Data
data(pennLC)
data <- pennLC$data

## Process geographical information and convert to grid
geo <- pennLC$geo[,2:3]
geo <- latlong2grid(geo)

## Get aggregated counts of population and cases for each county
population <- tapply(data$population, data$county, sum)
cases <- tapply(data$cases, data$county, sum)

## Based on the 16 strata levels, compute expected numbers of disease
n.strata <- 16
expected.cases <- expected(data$population, data$cases, n.strata)

## Set Parameters
k <- 1250
alpha.level <- 0.05

# not controlling for strata
results <- besag_newell(geo, population, cases, expected.cases=NULL, k, alpha.level)

# controlling for strata
results <- besag_newell(geo, population, cases, expected.cases, k, alpha.level)