View source: R/farringtonFlexible.R
farringtonFlexible | R Documentation |
The function takes range
values of the surveillance time
series sts
and for each time point uses a Poisson GLM with overdispersion to
predict an upper bound on the number of counts according to the procedure by
Farrington et al. (1996) and by Noufaily et al. (2012). This bound is then compared to the observed
number of counts. If the observation is above the bound, then an alarm is raised.
The implementation is illustrated in Salmon et al. (2016).
farringtonFlexible(sts, control = list(
range = NULL, b = 5, w = 3,
reweight = TRUE, weightsThreshold = 2.58,
verbose = FALSE, glmWarnings = TRUE,
alpha = 0.05, trend = TRUE, pThresholdTrend = 0.05,
limit54 = c(5,4), powertrans = "2/3",
fitFun = "algo.farrington.fitGLM.flexible",
populationOffset = FALSE,
noPeriods = 1, pastWeeksNotIncluded = NULL,
thresholdMethod = "delta"))
sts |
object of class |
control |
Control object given as a
|
The following steps are performed according to the Farrington et al. (1996) paper.
Fit of the initial model with intercept, time trend if trend
is TRUE
,
seasonal factor variable if noPeriod
is bigger than 1, and population offset if
populationOffset
is TRUE
. Initial estimation of mean and
overdispersion.
Calculation of the weights omega (correction for past outbreaks) if reweighting
is TRUE
.
The threshold for reweighting is defined in control
.
Refitting of the model
Revised estimation of overdispersion
Omission of the trend, if it is not significant
Repetition of the whole procedure
Calculation of the threshold value using the model to compute a quantile of the predictive distribution.
The method used depends on thresholdMethod
, this can either be:
One assumes that the prediction error (or a transformation of the prediction
error, depending on powertrans
), is normally distributed. The threshold is deduced from a quantile of
this normal distribution using the variance and estimate of the expected
count given by GLM, and the delta rule. The procedure takes into account both the estimation error (variance of the estimator
of the expected count in the GLM) and the prediction error (variance of the prediction error). This is the suggestion
in Farrington et al. (1996).
One assumes that the new count follows a negative binomial distribution parameterized by the expected count and the overdispersion estimated in the GLM. The threshold is deduced from a quantile of this discrete distribution. This process disregards the estimation error, though. This method was used in Noufaily, et al. (2012).
One also uses the assumption of the negative binomial sampling distribution but does not plug in the estimate of the expected count from the GLM, instead one uses a quantile from the asymptotic normal distribution of the expected count estimated in the GLM; in order to take into account both the estimation error and the prediction error.
Computation of exceedance score
Warning: monthly data containing the last day of each month as date should be analysed with epochAsDate=FALSE
in the sts
object. Otherwise February makes it impossible to find some reference time points.
An object of class sts
with the slots upperbound
and alarm
filled by appropriate output of the algorithm.
The control
slot of the input sts
is amended with the
following matrix elements, all with length(range)
rows:
Booleans indicating whether a time trend was fitted for this time point.
coefficient of the time trend in the GLM for this time point. If no trend was fitted it is equal to NA.
probability of observing a value at least equal to the observation under the null hypothesis .
expectation of the predictive distribution for each timepoint. It is only reported if the conditions for raising an alarm are met (enough cases).
input for the negative binomial distribution to get the upperbound as a quantile (either a plug-in from the GLM or a quantile from the asymptotic normal distribution of the estimator)
overdispersion of the GLM at each timepoint.
M. Salmon, M. Höhle
Farrington, C.P., Andrews, N.J, Beale A.D. and Catchpole, M.A. (1996): A statistical algorithm for the early detection of outbreaks of infectious disease. J. R. Statist. Soc. A, 159, 547-563.
Noufaily, A., Enki, D.G., Farrington, C.P., Garthwaite, P., Andrews, N.J., Charlett, A. (2012): An improved algorithm for outbreak detection in multiple surveillance systems. Statistics in Medicine, 32 (7), 1206-1222.
Salmon, M., Schumacher, D. and Höhle, M. (2016): Monitoring count time series in R: Aberration detection in public health surveillance. Journal of Statistical Software, 70 (10), 1-35. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v070.i10")}
algo.farrington.fitGLM
,algo.farrington.threshold
data("salmonella.agona")
# Create the corresponding sts object from the old disProg object
salm <- disProg2sts(salmonella.agona)
### RUN THE ALGORITHMS WITH TWO DIFFERENT SETS OF OPTIONS
control1 <- list(range=282:312,
noPeriods=1,
b=4, w=3, weightsThreshold=1,
pastWeeksNotIncluded=3,
pThresholdTrend=0.05,
alpha=0.1)
control2 <- list(range=282:312,
noPeriods=10,
b=4, w=3, weightsThreshold=2.58,
pastWeeksNotIncluded=26,
pThresholdTrend=1,
alpha=0.1)
salm1 <- farringtonFlexible(salm,control=control1)
salm2 <- farringtonFlexible(salm,control=control2)
### PLOT THE RESULTS
y.max <- max(upperbound(salm1),observed(salm1),upperbound(salm2),na.rm=TRUE)
plot(salm1, ylim=c(0,y.max), main='S. Newport in Germany', legend.opts=NULL)
lines(1:(nrow(salm1)+1)-0.5,
c(upperbound(salm1),upperbound(salm1)[nrow(salm1)]),
type="s",col='tomato4',lwd=2)
lines(1:(nrow(salm2)+1)-0.5,
c(upperbound(salm2),upperbound(salm2)[nrow(salm2)]),
type="s",col="blueviolet",lwd=2)
legend("topleft",
legend=c('Alarm','Upperbound with old options',
'Upperbound with new options'),
pch=c(24,NA,NA),lty=c(NA,1,1),
bg="white",lwd=c(2,2,2),col=c('red','tomato4',"blueviolet"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.