fit: Fit exponential models to incidence data

Package: incidence (Compute, Handle, Plot and Model Incidence of Dated Events)

Description

The function `fit` fits one or two exponential models to incidence data, of the form: log(y) = r * t + b
where 'y' is the incidence, 't' is time (in days), 'r' is the growth rate, and 'b' is the intercept (the log-incidence at t = 0). By default, `fit` fits a single model; if the argument `split` is provided, it fits two models, one on either side of the splitting date (typically the peak of the epidemic). When groups are present, they are included in the model as main effects and as interactions with dates. The function `fit_optim_split()` can be used to find the optimal 'splitting' date, defined as the date that gives the highest average R2 of the two models. Fits can be plotted using `plot`, or added to an existing incidence plot with the piping-friendly function `add_incidence_fit()`.
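As a minimal sketch of the log-linear model described above, the following uses only base R (`lm()` and `confint()`) on simulated counts. It is not the package's own code, and the names `sim`, `r_true`, `r_hat`, `b_hat` and `r_conf` are illustrative assumptions.

```r
# Simulated daily incidence with a known growth rate (illustrative data only)
set.seed(1)
r_true <- 0.1                                   # growth rate used for simulation
sim <- data.frame(t = 0:29)                     # time in days
sim$y <- rpois(nrow(sim), lambda = 5 * exp(r_true * sim$t))

# Zero counts cannot be log-transformed, so they are dropped before fitting
model <- lm(log(y) ~ t, data = sim[sim$y > 0, ])

r_hat  <- unname(coef(model)["t"])              # estimated growth rate 'r'
b_hat  <- unname(coef(model)["(Intercept)"])    # estimated intercept 'b'
r_conf <- confint(model, "t", level = 0.95)     # confidence interval of 'r'
```

Here `r_hat` should land close to the simulated rate `r_true`, and `r_conf` plays the same role as the `r.conf` item described under Value.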

Usage

```r
fit(x, split = NULL, level = 0.95, quiet = FALSE)

fit_optim_split(
  x,
  window = x$timespan/4,
  plot = TRUE,
  quiet = TRUE,
  separate_split = TRUE
)

## S3 method for class 'incidence_fit'
print(x, ...)

## S3 method for class 'incidence_fit_list'
print(x, ...)
```

Arguments

• `x`: An incidence object, generated by the function `incidence()`. For the `print` methods, an `incidence_fit` or `incidence_fit_list` object.

• `split`: An optional time point identifying the separation between the two models. If `NULL`, a single model is fitted. If provided, two models are fitted, one on the time period on either side of the split.

• `level`: The confidence level to be used for predictions; defaults to 0.95 (95%).

• `quiet`: A logical indicating if warnings from `fit` should be hidden; `FALSE` by default. Warnings typically indicate zero incidences, which are removed before performing the log-linear regression.

• `window`: The size, in days, of the time window either side of the split.

• `plot`: A logical indicating whether a plot should be added to the output (`TRUE`, default), showing the mean R2 for various splits.

• `separate_split`: If groups are present, should separate split dates be determined for each group? Defaults to `TRUE`, in which case separate split dates, and thus separate models, are constructed for each group. When `FALSE`, the split date is determined from the pooled data, and the data are modelled with the groups as main effects and interactions with date.

• `...`: Currently unused.

Value

For `fit()`, a list with the class `incidence_fit` (for a single model), or a list containing two `incidence_fit` objects (when fitting two models). `incidence_fit` objects contain:

• `$model`: the fitted linear model

• `$info`: a list containing various information extracted from the model (detailed below)

• `$origin`: the date corresponding to day '0'

The `$info` item is a list containing:

• `r`: the growth rate

• `r.conf`: the confidence interval of 'r'

• `pred`: a `data.frame` containing predictions of the model, including the true dates (`dates`), their numeric version used in the model (`dates.x`), the predicted value (`fit`), and the lower (`lwr`) and upper (`upr`) bounds of the associated confidence interval.

• `doubling`: the predicted doubling time in days; exists only if 'r' is positive

• `doubling.conf`: the confidence interval of the doubling time

• `halving`: the predicted halving time in days; exists only if 'r' is negative

• `halving.conf`: the confidence interval of the halving time
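The link between 'r' and the doubling and halving times can be sketched from the model form log(y) = r * t + b, assuming the usual definitions (the time needed for incidence to double or to halve). The symbols T_d and T_h are notation introduced here for illustration and do not appear in the package:

```latex
\begin{aligned}
y(t) &= e^{b}\, e^{r t} \\
y(t + T_d) = 2\, y(t) &\iff e^{r T_d} = 2 \iff T_d = \frac{\log 2}{r}, \quad r > 0 \\
y(t + T_h) = \tfrac{1}{2}\, y(t) &\iff e^{r T_h} = \tfrac{1}{2} \iff T_h = \frac{\log 2}{-r}, \quad r < 0
\end{aligned}
```

Under these definitions a doubling time is only meaningful for a positive 'r' and a halving time only for a negative 'r', which matches the conditions stated for `doubling` and `halving` above.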

For `fit_optim_split`, a list containing:

• `df`: a `data.frame` of dates that were used in the optimization procedure, and the corresponding average R2 of the resulting models.

• `split`: the optimal splitting date

• `fit`: an `incidence_fit_list` object containing the fit for each split. If `separate_split = TRUE`, the `incidence_fit_list` object will contain these splits nested within each group. All of the `incidence_fit` objects can be retrieved with `get_fit()`.

• `plot`: a plot showing the content of `df` (ggplot2 object)
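The idea behind the split optimisation can be sketched in base R as follows: for each candidate split, fit one log-linear model on each side and keep the split with the highest mean R2. This is an illustration under stated assumptions, not the package's implementation. In particular, the use of the unadjusted R2, the choice of candidates around the highest observed count, and all names (`sim`, `r2`, `mean_r2`, `window_days`, `candidates`, `scores`, `best_split`) are assumptions made for this example.

```r
# Simulated epidemic: exponential growth up to day 30, then decay
# (illustrative data only)
set.seed(1)
peak <- 30
sim <- data.frame(t = 0:59)
lambda <- ifelse(sim$t <= peak,
                 5 * exp(0.1 * sim$t),
                 5 * exp(0.1 * peak) * exp(-0.08 * (sim$t - peak)))
sim$y <- rpois(nrow(sim), lambda)
sim <- sim[sim$y > 0, ]                  # zero counts cannot be logged

# R2 of a log-linear model fitted to a subset of the data
r2 <- function(d) summary(lm(log(y) ~ t, data = d))$r.squared

# Mean R2 of the two models on either side of a candidate split
mean_r2 <- function(split) {
  mean(c(r2(sim[sim$t <= split, ]), r2(sim[sim$t >= split, ])))
}

# Assumption: candidate splits lie within 'window_days' of the highest count;
# 15 is a quarter of the 60-day series
window_days <- 15
t_peak <- sim$t[which.max(sim$y)]
candidates <- sim$t[abs(sim$t - t_peak) <= window_days]

scores <- vapply(candidates, mean_r2, numeric(1))
best_split <- candidates[which.max(scores)]
```

In this sketch, `candidates` and `scores` together correspond loosely to the `df` item above, and `best_split` to the `split` item.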

Author(s)

Thibaut Jombart thibautjombart@gmail.com, Zhian N. Kamvar zkamvar@gmail.com.

See Also

The `incidence()` function, to generate the 'incidence' objects. The `get_fit()` function, to flatten `incidence_fit_list` objects to a list of `incidence_fit` objects.

Examples

```r
if (require(outbreaks)) { withAutoprint({
  dat <- ebola_sim$linelist$date_of_onset

  ## EXAMPLE WITH A SINGLE MODEL

  ## compute weekly incidence
  i.7 <- incidence(dat, interval=7)
  plot(i.7)
  plot(i.7[1:20])

  ## fit a model on the first 20 weeks
  f <- fit(i.7[1:20])
  f
  names(f)
  head(get_info(f, "pred"))

  ## plot model alone (not recommended)
  plot(f)

  ## plot data and model (recommended)
  plot(i.7, fit = f)
  plot(i.7[1:25], fit = f)

  ## piping versions
  if (require(magrittr)) { withAutoprint({
    plot(i.7) %>% add_incidence_fit(f)

    ## EXAMPLE WITH 2 PHASES
    ## specifying the peak manually
    f2 <- fit(i.7, split = as.Date("2014-10-15"))
    f2
    plot(i.7) %>% add_incidence_fit(f2)

    ## finding the best 'peak' date
    f3 <- fit_optim_split(i.7)
    f3
    plot(i.7) %>% add_incidence_fit(f3$fit)
  })}

})}
```

