incidence: Compute the incidence of events

View source: R/incidence.R

Description

Compute the incidence of events

Usage

incidence(
  x,
  date_index,
  groups = NULL,
  interval = 1L,
  na_as_group = TRUE,
  counts = NULL,
  firstdate = NULL
)

Arguments

x

A data frame representing a linelist (or potentially a pre-aggregated dataset).

date_index

The time index(es) of the given data. This should be the name(s) corresponding to the desired date column(s) in x of class: integer, numeric, Date, POSIXct, POSIXlt, and character. (See Note about numeric and character formats). Multiple inputs only make sense when x is a linelist, and in this situation, to avoid ambiguity, the vector must be named. These names will be used for the resultant count columns.

groups

An optional vector giving the names of the grouping columns by which incidence should be computed.

interval

An integer or character indicating the (fixed) size of the time interval used for computing the incidence; defaults to 1 day. This can also be a text string that corresponds to a valid date interval, e.g.

* (x) day(s)
* (x) week(s)
* (x) epiweek(s)
* (x) isoweek(s)
* (x) month(s)
* (x) quarter(s)
* (x) year(s)

More details can be found in the "Interval specification" and "Week intervals" sections below.

na_as_group

A logical value indicating whether missing group values (NA) should be treated as a separate category (TRUE) or removed from consideration (FALSE). Defaults to TRUE.
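
As a rough base R illustration of the two behaviours (this is not how incidence() is implemented, and the vector grp is made up for the example), compare counting with and without an NA category:

```r
# Hypothetical group labels with one missing value
grp <- c("f", "m", NA, "f", "m", "m")

# Analogous to na_as_group = TRUE: NA is counted as its own category
with_na <- table(grp, useNA = "ifany")

# Analogous to na_as_group = FALSE: observations with a missing group are dropped
without_na <- table(grp)

print(with_na)
print(without_na)
```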

counts

The count variables of the given data. If NULL (default) the data is taken to be a linelist of individual observations.
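
To make the distinction between a linelist and a pre-aggregated dataset concrete, here is a minimal base R sketch. The data frames and the column names onset, date and cases are hypothetical; a column such as cases is the kind of variable the counts argument would refer to.

```r
# Hypothetical linelist: one row per case
linelist <- data.frame(
  onset = as.Date(c("2021-01-01", "2021-01-01", "2021-01-02", "2021-01-04"))
)

# Hypothetical pre-aggregated equivalent: one row per date plus a count column
tab <- table(linelist$onset)
aggregated <- data.frame(
  date = as.Date(names(tab)),
  cases = as.integer(tab)
)

print(aggregated)
```

Both objects carry the same information; the second simply stores it as counts rather than as individual observations.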

firstdate

When the interval is numeric or in days/months and has a numeric prefix greater than 1, then you can optionally specify the date that you wish to anchor your intervals to begin from. If NULL (default) then the intervals will start at the minimum value contained in the date_index column. Note that the class of firstdate must be Date if the date_index column is Date, POSIXct, POSIXlt, or character and integer otherwise.
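
The effect of anchoring can be sketched in base R. The helper bin_start() below is hypothetical and only illustrates the assumed idea of fixed-width bins counted from an anchor date; it is not taken from the package.

```r
# Hypothetical helper: start of the k-day bin containing each date,
# with bins anchored at `first` (assumed semantics, for illustration only)
bin_start <- function(dates, k, first = min(dates)) {
  offset <- as.integer(dates - first)  # whole days since the anchor
  first + (offset %/% k) * k
}

dates <- as.Date(c("2021-03-03", "2021-03-04", "2021-03-07", "2021-03-10"))

# Default anchor: the earliest date in the data
default_bins <- bin_start(dates, k = 3)

# An explicit anchor two days earlier shifts every bin boundary
anchored_bins <- bin_start(dates, k = 3, first = as.Date("2021-03-01"))

print(default_bins)
print(anchored_bins)
```

With the default anchor the first two dates share a bin; with the earlier anchor each date falls into its own bin, which is why the choice of anchor can change the resulting counts.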

Value

An incidence2 object. This is a subclass of incidence_df that holds an aggregated count of observations grouped according to the specified interval and, optionally, the given groups. By default it will contain the following columns:

Note

Input data (date_index)

Interval specification (interval)

incidence() uses the grates package to generate date groupings. The grouping used depends on the value of interval. This can be specified as either an integer value or a more standard specification such as "day", "week", "month", "quarter" or "year". The format in this situation is similar to that used by seq.Date() where these values can optionally be preceded by a (positive or negative) integer and a space, or followed by "s". When no prefix is given:

When a prefix is provided (e.g. "2 weeks") the output is an object of class "period" (see as_period()). Note that for the values "month", "quarter" and "year", intervals are always chosen to start at the beginning of the calendar equivalent. If the input is an integer value, it is treated as if it were specified in days (i.e. 2 and "2 days" produce the same output).

The only interval values that do not produce these grouped classes are 1, 1L, "day" and "days" (the latter two without a prefix). In this situation the returned object is of the standard "Date" class.
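
Since the interval format is modelled on seq.Date(), the conventions can be tried out in base R. The sketch below only uses seq() and format() on an arbitrary example date; it illustrates the specification style rather than the grouping classes that incidence() returns.

```r
start <- as.Date("2021-01-15")

# A numeric step is interpreted as days, so 2 and "2 days" agree
by_number <- seq(start, by = 2, length.out = 4)
by_string <- seq(start, by = "2 days", length.out = 4)
stopifnot(all(by_number == by_string))

# The integer prefix and the trailing "s" are both optional in seq.Date()
one_week  <- seq(start, by = "week", length.out = 3)
two_weeks <- seq(start, by = "2 weeks", length.out = 3)

# Snapping a date to the start of its calendar month, as described for "month"
month_start <- as.Date(format(start, "%Y-%m-01"))

print(two_weeks)
print(month_start)
```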

Week intervals

It is possible to construct incidence objects standardised to any day of the week. The default is to use the ISO 8601 definition of weeks, which start on Monday. You can specify the day of the week an incidence object should be standardised to by using the pattern "n W weeks", where "W" represents the weekday in English or the current locale and "n" represents the duration, which can be omitted. Below are examples of specifying weeks starting on different days assuming we had data that started on 2016-09-05, which is ISO week 36 of 2016:

It's also possible to use something like "3 weeks: Saturday". In addition, there are keywords reserved for specific days of the week: "isoweek" (weeks starting on Monday) and "epiweek" (weeks starting on Sunday).
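
The week-start arithmetic described in this section can be checked in base R for the reference date 2016-09-05. The helper week_start() is hypothetical and is only meant to show how the first day of a week moves with the chosen weekday; it does not reproduce the package's own classes.

```r
# Hypothetical helper: first day of the week containing `d`, where
# `first_day` is the weekday weeks start on (1 = Monday, ..., 7 = Sunday)
week_start <- function(d, first_day = 1L) {
  wd <- as.integer(format(d, "%u"))  # ISO weekday number, Monday = 1
  d - (wd - first_day) %% 7L
}

d <- as.Date("2016-09-05")

iso_week <- format(d, "%V")                   # ISO 8601 week number
monday   <- week_start(d, first_day = 1L)     # ISO-style weeks
sunday   <- week_start(d, first_day = 7L)     # Sunday-start weeks
saturday <- week_start(d, first_day = 6L)     # Saturday-start weeks

print(iso_week)
print(c(monday = format(monday), sunday = format(sunday), saturday = format(saturday)))
```

Because 2016-09-05 is itself a Monday, Monday-start weeks begin on that date, while Sunday-start and Saturday-start weeks containing it begin one and two days earlier respectively.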

Examples

if (requireNamespace("outbreaks", quietly = TRUE)) {
  withAutoprint({
    data(ebola_sim_clean, package = "outbreaks")
    dat <- ebola_sim_clean$linelist

    # daily incidence
    incidence(dat, date_of_onset)

    # weekly incidence
    incidence(dat, date_of_onset, interval = "week")

    # starting on a Monday
    incidence(dat, date_of_onset, interval = "isoweek")

    # starting on a Sunday
    incidence(dat, date_of_onset, interval = "epiweek")

    # group by gender
    incidence(dat, date_of_onset, interval = 7, groups = gender)

    # group by gender and hospital
    incidence(dat, date_of_onset, interval = "2 weeks", groups = c(gender, hospital))
  })
}

# use of firstdate
dat <- data.frame(dates = Sys.Date() + sample(-3:10, 10, replace = TRUE))
incidence(dat, dates, interval = "week", firstdate = Sys.Date() + 1)

incidence2 documentation built on July 15, 2021, 1:06 a.m.