R/incidence.R
In incidence: Compute, Handle, Plot and Model Incidence of Dated Events

Documented in incidence incidence.character incidence.Date incidence.default incidence.integer incidence.numeric incidence.POSIXt

#' Compute incidence of events from a vector of dates.
#'
#' This function computes incidence based on dates of events provided in
#' various formats. A fixed interval, provided as numbers of days, is used to
#' define time intervals. Counts within an interval always include the first
#' date, after which they are labeled, and exclude the second. For instance,
#' intervals labeled as 0, 3, 6, ... mean that the first bin includes days 0, 1
#' and 2, the second interval includes 3, 4 and 5 etc.
#'
#' @param dates A vector of dates, which can be provided as objects of the
#' class: integer, numeric, Date, POSIXct, POSIXlt, and character. (See Note 
#' about `numeric` and `character` formats) 
#'
#' @param interval An integer or character indicating the (fixed) size of the
#' time interval used for computing the incidence; defaults to 1 day. This can
#' also be a text string that corresponds to a valid date interval: day, week,
#' month, quarter, or year. (See Note).
#'
#' @param groups An optional factor defining groups of observations for which
#' incidence should be computed separately.
#'
#' @param na_as_group A logical value indicating if missing group (NA) should be
#' treated as a separate group.
#'
#' @param first_date,last_date optional first/last dates to be used in the
#'   epicurve. When these are `NULL` (default), the dates from the first/last
#'   dates are taken from the observations. If these dates are provided, the
#'   observations will be trimmed to the range of \[first_date, last_date\].
#'
#' @param ... Additional arguments passed to other methods (none are used).
#'
#' @return An list with the class `incidence`, which contains the
#' following items:
#'
#'
#' - **dates**: The dates marking the left side of the bins used for counting
#' events. When `standard = TRUE` and the interval represents weeks, months,
#' quarters, or years, the first date will represent the first standard date
#' (See Interval specification, below).
#'    
#' - **counts**: A matrix of incidence counts, which one column per group (and
#' a single column if no groups were used).
#'
#' - **timespan**: The length of the period for which incidence is computed, in
#' days.
#'
#' - **interval**: The bin size. If it's an integer, it represents the number
#'   of days between each bin. It can also be a character, e.g. "2 weeks" or 
#'   "6 months".
#'
#' - **n**: The total number of cases.
#'
#' - **weeks**: Dates in week format (YYYY-Www), where YYYY corresponds to the
#'   year of the given week and ww represents the numeric week of the year. 
#'   This will be a produced from the function [aweek::date2week()]. Note that
#'   these will have a special `"week_start"` attribute indicating which day of
#'   the ISO week the week starts on (see Weeks, below).
#'   
#' - **isoweeks**: ISO 8601 week format YYYY-Www, which is returned only when
#' ISO week-based weekly incidence is computed.
#'
#'
#' @details For details about the `incidence class`, see the dedicated
#' vignette:\cr `vignette("incidence_class", package = "incidence")`
#'
#' @note \subsection{Input data (`dates`)}{
#'  - **Decimal (numeric) dates**: will be truncated with a warning
#'  - **Character dates** should be in the unambiguous `yyyy-mm-dd` (ISO 8601)
#'   format. Any other format will trigger an error.
#' }
#' 
#' \subsection{Interval specification (`interval`)}{
#' If `interval` is a valid character (e.g. "week" or "1 month"), then
#' the bin will start at the beginning of the interval just before the first
#' observation by default. For example, if the first case was recorded on
#' Wednesday, 2018-05-09:
#'
#'  - "week"    : first day of the week (i.e. Monday, 2018-05-07) (defaults to ISO weeks, see "Week intervals", below)
#'  - "month"   : first day of the month (i.e. 2018-05-01)
#'  - "quarter" : first day of the quarter (i.e. 2018-04-01)
#'  - "year"    : first day of the calendar year (i.e. 2018-01-01)
#'
#' These default intervals can be overridden with `standard = FALSE`, which
#' sets the interval to begin at the first observed case.
#' }
#' 
#' \subsection{Week intervals}{
#'
#' As of _incidence_ version 1.7.0, it is possible to construct standardized
#' incidence objects standardized to any day of the week thanks to the
#' [aweek::date2week()] function from the \pkg{aweek} package. The default
#' state is to use ISO 8601 definition of weeks, which start on Monday. You can
#' specify the day of the week an incidence object should be standardised to by
#' using the pattern "{n} {W} weeks" where "{W}" represents the weekday in an
#' English or current locale and "{n}" represents the duration, but this can be
#' ommitted.  Below are examples of specifying weeks starting on different days
#' assuming we had data that started on 2016-09-05, which is ISO week 36 of
#' 2016:
#'
#'  - interval = "2 monday weeks" (Monday 2016-09-05)
#'  - interval = "1 tue week" (Tuesday 2016-08-30)
#'  - interval = "1 Wed week" (Wednesday 2016-08-31)
#'  - interval = "1 Thursday week" (Thursday 2016-09-01) 
#'  - interval = "1 F week" (Friday 2016-09-02)
#'  - interval = "1 Saturday week" (Saturday 2016-09-03)
#'  - interval = "Sunday week" (Sunday 2016-09-04)
#'
#' It's also possible to use something like "3 weeks: Saturday"; In addition,
#' there are keywords reserved for specific days of the week:
#'   
#'   - interval = "week", standard = TRUE (Default, Monday)
#'   - interval = "ISOweek"  (Monday)
#'   - interval = "EPIweek"  (Sunday)
#'   - interval = "MMWRweek" (Sunday)
#'
#' The "EPIweek" specification is not strictly reserved for CDC epiweeks, but
#' can be prefixed (or posfixed) by a day of the week: "1 epiweek: Saturday".
#'
#' }
#'
#' \subsection{The `first_date` argument}{
#' Previous versions of _incidence_ had the `first_date` argument override
#' `standard = TRUE`. It has been changed as of _incidence_ version 1.6.0 to
#' be more consistent with the behavior when `first_date = NULL`. This, however
#' may be a change in behaviour, so a warning is now issued once and only once
#' if `first_date` is specified, but `standard` is not. To never see this
#' warning, use `options(incidence.warn.first_date = FALSE)`. 
#' }
#'
#' The intervals for "month", "quarter", and "year" will necessarily vary in the
#' number of days they encompass and warnings will be generated when the first
#' date falls outside of a calendar date that is easily represented across the
#' interval.
#' 
#'
#' @seealso
#' The main other functions of the package include:
#'
#'  - [incidence::plot.incidence()]: Plot epicurves from an incidence object.
#'
#'  - [incidence::fit()]: Fit log-linear model to computed incidence.
#'
#'  - [incidence::fit_optim_split()]: Find the optimal peak of the epidemic
#' and fits log-linear models on either side of the peak.
#'
#'  - [incidence::subset()]: Handling of `incidence`
#' objects.
#'
#'  - [incidence::pool()]: Sum incidence over groups.
#'
#'  - [incidence::as.data.frame.incidence()]: Convert an `incidence` object to a
#' `data.frame`.
#'
#' The following vignettes are also available:
#'
#' - `overview`: Provides an overview of the package's features.
#'
#' - `customize_plot`: Provides some tips on finer plot customization.
#'
#' - `incidence_class`: Details the content of the `incidence`
#' class.
#'
#'
#'
#'
#' @author Thibaut Jombart, Rich Fitzjohn, Zhian Kamvar
#'
#' @rdname incidence
#'
#' @importFrom utils head tail
#'
#' @export
#'
#' @examples
#' ## toy example
#' incidence(c(1, 5, 8, 3, 7, 2, 4, 6, 9, 2))
#' incidence(c(1, 5, 8, 3, 7, 2, 4, 6, 9, 2), 2)
#'
#' ## example using simulated dataset
#' if(require(outbreaks)) { withAutoprint({
#'   onset <- outbreaks::ebola_sim$linelist$date_of_onset
#'
#'   ## daily incidence
#'   inc <- incidence(onset)
#'   inc
#'   plot(inc)
#'
#'   ## weekly incidence
#'   inc.week <- incidence(onset, interval = 7, standard = FALSE)
#'   inc.week
#'   plot(inc.week)
#'   plot(inc.week, border = "white") # with visible border
#'   
#'   # Starting on Monday
#'   inc.isoweek <- incidence(onset, interval = "isoweek")
#'   inc.isoweek
#'   
#'   # Starting on Sunday
#'   inc.epiweek <- incidence(onset, interval = "epiweek")
#'   inc.epiweek
#' 
#'   # Starting on Saturday
#'   inc.epiweek <- incidence(onset, interval = "saturday epiweek")
#'   inc.epiweek
#'
#'   ## use group information
#'   sex <- outbreaks::ebola_sim$linelist$gender
#'   inc.week.gender <- incidence(onset, interval = 7,
#'                                groups = sex, standard = FALSE)
#'   inc.week.gender
#'   head(inc.week.gender$counts)
#'   plot(inc.week.gender, border = "grey90")
#'   inc.satweek.gender <- incidence(onset, interval = "2 epiweeks: saturday",
#'                                   groups = sex)
#'   inc.satweek.gender
#'   plot(inc.satweek.gender, border = "grey90")
#'
#' })}
#' 
#' # Use of first_date
#' d <- Sys.Date() + sample(-3:10, 10, replace = TRUE)
#' 
#' # `standard` specified, no warning
#' di <- incidence(d, interval = "week", first_date = Sys.Date() - 10, standard = TRUE)
#'
#' # warning issued if `standard` not specified
#' di <- incidence(d, interval = "week", first_date = Sys.Date() - 10)
#'
#' # second instance: no warning issued
#' di <- incidence(d, interval = "week", first_date = Sys.Date() - 10)
#'
#'
incidence <- function(dates, interval = 1L, ...) {
  UseMethod("incidence")
}


#' @export
#' @rdname incidence
incidence.default <- function(dates, interval = 1L, ...) {
  incidence(check_dates(dates), interval = interval, ...)
}

#' @export
#' @rdname incidence
#'
#' @param standard (Only applicable to Date objects) When `TRUE` (default) and the
#'   `interval` one of "week", "month", "quarter", or "year", then this will
#'   cause the bins for the counts to start at the beginning of the interval
#'   (See Note). 

incidence.Date <- function(dates, interval = 1L, standard = TRUE, groups = NULL,
                           na_as_group = TRUE, first_date = NULL,
                           last_date = NULL, ...) {
  the_call <- match.call()
  warnme <- getOption('incidence.warn.first_date', FALSE) 
  if (warnme && !is.null(the_call[["first_date"]]) && is.null(the_call[["standard"]])) {
    fd  <- as.character(deparse(the_call[["first_date"]]))
    msg <- "\n\nAs of incidence version 1.6.0, the default behavior has been"
    msg <- paste(msg, "modified so that `first_date` no longer overrides")
    msg <- paste(msg, "`standard`. This means that the first date will be")
    msg <- paste(msg, "either on or before %s.\nIf you want to use %s as the precise")
    msg <- paste(msg, "`first_date`, set `standard = FALSE`.")
    msg <- paste(msg, "To remove this warning in the future,  explicitly set the `standard` argument OR use `options(incidence.warn.first_date = FALSE)`\n", sep = "\n\n")
    warning(sprintf(msg, first_date, fd))
    # turn the warning off so that it's not so noisy
    options(incidence.warn.first_date = FALSE)
  }
  dots <- check_dots(list(...), names(formals(incidence.Date)))
  ## make sure input can be used
  if (!is.logical(standard)) {
    stop("The argument `standard` must be either `TRUE` or `FALSE`.")
  }
  if ("standard" %in% names(dots)) {
    # the user specified iso_week and was given a warning.
    standard <- dots$standard
  }
  out <- make_incidence(dates = dates,
                        interval = interval,
                        groups = groups,
                        na_as_group = na_as_group,
                        first_date = first_date,
                        last_date = last_date,
                        standard = standard,
                        ...)
  if (check_week(interval) && standard) {
    # dates are the first days of corresponding ISOweeks.
    week_start   <- get_week_start(interval)
    out$weeks <- aweek::date2week(out$dates, week_start, floor_day = TRUE)
    if (attr(out$weeks, "week_start") == 1) {
      out$isoweeks <- as.character(out$weeks)
    }
  }

  out
}


#' @export
#' @rdname incidence
incidence.character <- function(dates, interval = 1L, standard = TRUE, groups = NULL,
                           na_as_group = TRUE, first_date = NULL,
                           last_date = NULL, ...) {
  iso_std <- grepl("^[0-9]{4}-[01][0-9]-[0-3][0-9]$", trimws(dates))
  iso_std[is.na(dates)] <- TRUE # prevent false alarms
  if (!all(iso_std)) {
    msg <- paste("Not all dates are in ISO 8601 standard format (yyyy-mm-dd).",
                 "The first incorrect date is %s"
    )
    stop(sprintf(msg, dates[!iso_std][1]))
  }
  dots  <- check_dots(list(...), names(formals(incidence.Date)))
  dates <- check_dates(dates)

  ret <- incidence(as.Date(trimws(dates)),
                   interval = interval,
                   standard = standard,
                   groups = groups,
                   na_as_group = na_as_group,
                   first_date = first_date,
                   last_date = last_date,
                   ...)
  ret
}
## The default incidence is designed for dates provided as integers, and a fixed
## time interval defaulting to 1. 'bins' are time intervals, identified by the
## left date, left-inclusive and right-exclusive, i.e. the time interval defined
## by d1 and d2 is [d1, d2[.

#' @export
#' @rdname incidence

incidence.integer <- function(dates, interval = 1L, groups = NULL,
                              na_as_group = TRUE, first_date = NULL,
                              last_date = NULL, ...) {
  dots <- check_dots(list(...), names(formals(incidence.integer)))
  interval <- valid_interval_integer(interval)
  out <- make_incidence(dates = dates,
                        interval = interval,
                        groups = groups,
                        na_as_group = na_as_group,
                        first_date = first_date,
                        last_date = last_date,
                        ...)
  out$dates    <- as.integer(out$dates)
  out$timespan <- as.integer(out$timespan)
  out$interval <- as.integer(out$interval)
  out
}

#' @export
#' @rdname incidence

incidence.numeric <- function(dates, interval = 1L, groups = NULL,
                              na_as_group = TRUE, first_date = NULL,
                              last_date = NULL, ...) {
  dots <- check_dots(list(...), names(formals(incidence.numeric)))
  interval  <- valid_interval_integer(interval)
  ## make sure input can be used
  out <- make_incidence(dates = dates,
                        interval = interval,
                        groups = groups,
                        na_as_group = na_as_group,
                        first_date = first_date,
                        last_date = last_date,
                        ...)
  out$dates <- as.numeric(out$dates)
  out
}

#' @export
#' @rdname incidence

incidence.POSIXt <- function(dates, interval = 1L, standard = TRUE, groups = NULL,
                             na_as_group = TRUE, first_date = NULL,
                             last_date = NULL, ...) {
  ## make sure input can be used

  dots  <- check_dots(list(...), names(formals(incidence.Date)))
  dates <- check_dates(as.POSIXct(dates))

  ret <- incidence(as.Date(dates),
                   interval = interval,
                   standard = standard,
                   groups = groups,
                   na_as_group = na_as_group,
                   first_date = first_date,
                   last_date = last_date,
                   ...)

  ret$dates <- as.POSIXlt(ret$dates)
  if (inherits(dates, "POSIXct")) {
    ret$dates <- as.POSIXct(ret$dates)
  }
  ret
}

Any scripts or data that you put into this service are public.

incidence documentation built on Nov. 8, 2020, 4:30 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

incidence
Compute, Handle, Plot and Model Incidence of Dated Events

R/incidence.R
In incidence: Compute, Handle, Plot and Model Incidence of Dated Events

Defines functions incidence.POSIXt incidence.numeric incidence.integer incidence.character incidence.Date incidence.default incidence

Documented in incidence incidence.character incidence.Date incidence.default incidence.integer incidence.numeric incidence.POSIXt

Try the incidence package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

incidence Compute, Handle, Plot and Model Incidence of Dated Events

R/incidence.R In incidence: Compute, Handle, Plot and Model Incidence of Dated Events

Defines functions incidence.POSIXt incidence.numeric incidence.integer incidence.character incidence.Date incidence.default incidence

Documented in incidence incidence.character incidence.Date incidence.default incidence.integer incidence.numeric incidence.POSIXt

Try the incidence package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

incidence
Compute, Handle, Plot and Model Incidence of Dated Events

R/incidence.R
In incidence: Compute, Handle, Plot and Model Incidence of Dated Events