earsC: Surveillance for a count data time series using the EARS C1,...


Description

For each time point in range, the function computes a threshold for the number of counts in the surveillance time series sts, based on values from the recent past. The observed count is then compared to this threshold: if the observation is above a specific quantile of the prediction interval, an alarm is raised. The method is especially useful for data with few reference values, since it needs only counts from the recent past.

Usage

 earsC(sts, control = list(range = NULL, method = "C1",
                            alpha = 0.001))

Arguments

sts

Object of class sts (including the observed and the state time series), which is to be monitored.

control

Control object, a list with the following components:

range

Specifies the index of all timepoints which should be tested. If range is NULL the maximum number of possible timepoints is used. This number depends on the method chosen. For C1 all timepoints from timepoint 8 can be assessed, for C2 from timepoint 10 and for C3 from timepoint 12.

method

String indicating which method to use:
"C1" for the EARS C1-MILD method, "C2" for the EARS C2-MEDIUM method, "C3" for the EARS C3-HIGH method. If method is NULL, C1 is used.

alpha

An approximate (two-sided) (1-\alpha) \cdot 100\% prediction interval is calculated. If alpha is NULL, 0.001 is assumed for C1 and C2, whereas 0.025 is assumed for C3. These are the choices made at the CDC.
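As a rough illustration of the defaults described above, the following base R sketch tabulates the first assessable timepoint and the default alpha for each method, and converts each alpha into the normal quantile z_{1-\alpha} that the Details section uses as the alarm threshold. The vector names first_timepoint and default_alpha are made up for this example and are not part of the package.

```r
# Earliest assessable timepoint per method, as stated for `range`
first_timepoint <- c(C1 = 8, C2 = 10, C3 = 12)

# Alpha assumed when `alpha` is NULL, as stated for `alpha`
default_alpha <- c(C1 = 0.001, C2 = 0.001, C3 = 0.025)

# Standard normal quantile z_{1-alpha} for each default
z <- qnorm(1 - default_alpha)
round(z, 2)
#>   C1   C2   C3
#> 3.09 3.09 1.96
```

With these defaults, C3 uses a noticeably lower threshold than C1 and C2.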

Details

The three methods differ in the baseline used to calculate the expected value and in how the test statistic is derived from it. For each method, the function does the following:

  1. For C1 the baseline consists of the 7 timepoints before the assessed timepoint t, t-7 to t-1. The expected value is the mean of the baseline. An approximate (two-sided) (1-\alpha) \cdot 100\% prediction interval is calculated based on the assumption that the difference between the observed value and the expected value, divided by the standard deviation of counts over the sliding window, called C_1(t), follows a standard normal distribution in the absence of outbreaks:

    C_1(t)= \frac{Y(t)-\bar{Y}_1(t)}{S_1(t)},

    where

    \bar{Y}_1(t)= \frac{1}{7} \sum_{i=t-7}^{t-1} Y(i)

    and

    S^2_1(t)= \frac{1}{6} \sum_{i=t-7}^{t-1} [Y(i) - \bar{Y}_1(t)]^2.

    Then under the null hypothesis of no outbreak,

    C_1(t) \sim \mathcal{N}(0,1).

    An alarm is raised if

    C_1(t) \geq z_{1-\alpha},

    with z_{1-\alpha} the (1-\alpha)-quantile of the standard normal distribution.

    The upper bound U_1(t) is then defined by:

    U_1(t)= \bar{Y}_1(t) + z_{1-\alpha} S_1(t).

  2. C2 is very close to C1 apart from a 2-day lag in the baseline definition. For C2 the baseline consists of 7 timepoints with a 2-day lag before the assessed timepoint t, t-9 to t-3. The expected value is the mean of the baseline. An approximate (two-sided) (1-\alpha) \cdot 100\% prediction interval is calculated based on the assumption that the difference between the observed value and the expected value, divided by the standard deviation of counts over the sliding window, called C_2(t), follows a standard normal distribution in the absence of outbreaks:

    C_2(t)= \frac{Y(t)-\bar{Y}_2(t)}{S_2(t)},

    where

    \bar{Y}_2(t)= \frac{1}{7} \sum_{i=t-9}^{t-3} Y(i)

    and

    S^2_2(t)= \frac{1}{6} \sum_{i=t-9}^{t-3} [Y(i) - \bar{Y}_2(t)]^2.

    Then under the null hypothesis of no outbreak,

    C_2(t) \sim \mathcal{N}(0,1).

    An alarm is raised if

    C_2(t) \geq z_{1-\alpha},

    with z_{1-\alpha} the (1-\alpha)-quantile of the standard normal distribution.

    The upper bound U_2(t) is then defined by:

    U_2(t)= \bar{Y}_2(t) + z_{1-\alpha} S_2(t).

  3. C3 is quite different from the two other methods, but it is based on C2: it uses C_2 from timepoint t and from the two previous timepoints. This means the baseline consists of timepoints t-11 to t-3. The statistic C_3(t) is the sum of discrepancies between observations and predictions:

    C_3(t)= \sum_{i=t-2}^{t} \max(0,C_2(i)-1).

    Then under the null hypothesis of no outbreak,

    C_3(t) \sim \mathcal{N}(0,1).

    An alarm is raised if

    C_3(t) \geq z_{1-\alpha},

    with z_{1-\alpha} the (1-\alpha)-quantile of the standard normal distribution.

    The upper bound U_3(t) is then defined by:

    U_3(t)= \bar{Y}_2(t) + S_2(t)\left(z_{1-\alpha}-\sum_{i=t-2}^{t-1} \max(0,C_2(i)-1)\right).
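The formulas above can be traced by hand on a short count vector. The following is a minimal base R sketch of the C1, C2 and C3 statistics as written in this section, not the package's own implementation. The helpers ears_stat and ears_c3 and the counts in y are made up for illustration, and the sketch assumes a non-constant baseline (a zero standard deviation would make the statistic undefined).

```r
# Hypothetical helper (not part of the surveillance package): baseline mean,
# standard deviation and standardized statistic for timepoint tp of counts y.
# lag = 0 uses the C1 baseline tp-7..tp-1; lag = 2 uses the C2 baseline tp-9..tp-3.
ears_stat <- function(y, tp, lag = 0) {
  baseline <- y[(tp - 7 - lag):(tp - 1 - lag)]
  y_bar <- mean(baseline)
  s <- sd(baseline)  # sd() divides by n - 1 = 6, matching S^2(t)
  list(mean = y_bar, sd = s, stat = (y[tp] - y_bar) / s)
}

# C3: sum of max(0, C2(i) - 1) over i = tp-2, tp-1, tp
ears_c3 <- function(y, tp) {
  c2 <- sapply((tp - 2):tp, function(i) ears_stat(y, i, lag = 2)$stat)
  sum(pmax(0, c2 - 1))
}

# Made-up counts with a jump at the last timepoint
y <- c(3, 5, 4, 6, 5, 4, 5, 6, 4, 5, 5, 14)
tp <- 12                 # first timepoint at which C3 can be assessed
z <- qnorm(1 - 0.001)    # threshold z_{1-alpha} for alpha = 0.001

c1 <- ears_stat(y, tp, lag = 0)
c2 <- ears_stat(y, tp, lag = 2)
c3 <- ears_c3(y, tp)

u1 <- c1$mean + z * c1$sd   # upper bound U_1(t)
u2 <- c2$mean + z * c2$sd   # upper bound U_2(t)
alarm_c1 <- c1$stat >= z    # equivalent to y[tp] >= u1
alarm_c2 <- c2$stat >= z    # equivalent to y[tp] >= u2
```

Comparing the statistic with z_{1-\alpha} and comparing the observed count with the upper bound are the same test, which is why the function can report both an alarm and an upper bound from one computation.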

Value

An object of class sts with the slots upperbound and alarm filled by the chosen method.

Author(s)

M. Salmon

Source

Fricker, R.D., Hegler, B.L., and Dunfee, D.A. (2008). Comparing syndromic surveillance detection methods: EARS versus a CUSUM-based methodology. Statistics in Medicine, 27:3407-3429.

Examples

library(surveillance)

# Simulate data and convert to an sts object
disProgObj <- sim.pointSource(p = 0.99, r = 0.5, length = 208, A = 1,
                              alpha = 1, beta = 0, phi = 0,
                              frequency = 1, state = NULL, K = 1.7)
stsObj <- disProg2sts(disProgObj)

# Run C1 on timepoints 20 to 208 and plot the result
res1 <- earsC(stsObj, control = list(range = 20:208, method = "C1"))
plot(res1, legend.opts = list(horiz = TRUE, x = "topright"), dx.upperbound = 0)

# Compare C3 upper bounds: alpha = 0.001 (black) versus the default alpha (red)
res3 <- earsC(stsObj, control = list(range = 20:208, method = "C3", alpha = 0.001))
plot(res3@upperbound, type = "l")
res3 <- earsC(stsObj, control = list(range = 20:208, method = "C3"))
lines(res3@upperbound, col = "red")

jimhester/surveillance documentation built on May 19, 2019, 10:33 a.m.