bin_by_date | R Documentation |
Aggregates data by specified time periods (e.g., weeks, months) and calculates (weighted)
counts. Incidence rates are also calculated using the provided population numbers.
This function is the core date binning engine used by geom_epicurve() and stat_bin_date() for creating epidemiological time series visualizations.
bin_by_date(
  x,
  dates_from,
  n = 1,
  population = 1,
  fill_gaps = FALSE,
  date_resolution = "week",
  week_start = 1,
  .groups = "drop"
)
x |
Either a data frame with a date column, or a date vector. |
dates_from |
Column name containing the dates to bin. Used when x is a data.frame. |
n |
Numeric column with case counts (or weights). Supports quoted and unquoted column names. |
population |
A number or a numeric column with the population size. Used to calculate the incidence. |
fill_gaps |
Logical; If |
date_resolution |
Character string specifying the time unit for date aggregation.
Possible values include:
|
week_start |
Integer specifying the start of the week (1 = Monday, 7 = Sunday).
Only used when |
.groups |
See |
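To illustrate what the week_start argument changes, here is a minimal base R sketch. It is not the package's code, and the variable names (d, monday_start, sunday_start) are chosen only for this illustration. It floors one date to the start of its week under a Monday start (week_start = 1) and under a Sunday start (week_start = 7).

```r
# Illustration only: base R arithmetic, not the internals of bin_by_date().
d <- as.Date("2024-12-11")                    # a Wednesday

# "%u": weekday as 1 = Monday ... 7 = Sunday
days_since_monday <- as.integer(format(d, "%u")) - 1
# "%w": weekday as 0 = Sunday ... 6 = Saturday
days_since_sunday <- as.integer(format(d, "%w"))

monday_start <- d - days_since_monday         # week starting on Monday
sunday_start <- d - days_since_sunday         # week starting on Sunday

print(monday_start)
print(sunday_start)
```

The same date lands in a bin starting 2024-12-09 with a Monday start and 2024-12-08 with a Sunday start, so the choice shifts every weekly bin by one day.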
The function performs several key operations:
Date coercion: Converts the date column to proper Date format
Gap filling (optional): Generates complete temporal sequences to fill missing time periods with zeros
Date binning: Rounds dates to the specified resolution using lubridate::floor_date()
Weight and population handling: Processes count weights and population denominators
Aggregation: Groups by binned dates and sums weights to get counts and incidence
Grouping behaviour: The function respects existing grouping in the input data frame.
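The operations listed above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the function's actual implementation: the data frame cases, the constant population of 10000, and the column name week are all made up for the example, and the flooring is done by hand instead of with lubridate::floor_date().

```r
# Hypothetical input: five case records with onset dates and weights.
cases <- data.frame(
  onset_date = as.Date(c("2024-12-10", "2024-12-12", "2024-12-16",
                         "2024-12-31", "2025-01-02")),
  n = c(1, 2, 1, 3, 1)
)
population <- 10000  # assumed constant denominator

# Date binning: floor each date to the Monday of its week.
# format(x, "%u") returns the weekday as 1 = Monday ... 7 = Sunday.
weekday <- as.integer(format(cases$onset_date, "%u"))
cases$week <- cases$onset_date - (weekday - 1)

# Aggregation: sum the weights within each weekly bin.
binned <- aggregate(n ~ week, data = cases, FUN = sum)

# Gap filling: build the complete weekly sequence, then set empty bins to 0.
all_weeks <- data.frame(
  week = seq(min(binned$week), max(binned$week), by = "week")
)
binned <- merge(all_weeks, binned, by = "week", all.x = TRUE)
binned$n[is.na(binned$n)] <- 0

# Incidence: counts divided by the population.
binned$incidence <- binned$n / population

print(binned)
```

The week starting 2024-12-23 contains no records, so it only appears in the result because of the gap-filling step.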
A data frame with the following columns:
A date column with the same name as dates_from, where values are binned to the start of the specified time period.
n: Count of observations (sum of weights) for each time period
incidence: Incidence rate calculated as n / population for each time period
Any existing grouping variables are preserved
library(dplyr)

# Create sample data
outbreak_data <- data.frame(
  onset_date = as.Date("2024-12-10") + sample(0:100, 50, replace = TRUE),
  cases = sample(1:5, 50, replace = TRUE)
)

# Basic weekly binning
bin_by_date(outbreak_data, dates_from = onset_date)

# Weekly binning with case weights
bin_by_date(outbreak_data, onset_date, n = cases)

# Monthly binning
bin_by_date(outbreak_data, onset_date,
  date_resolution = "month"
)

# ISO week binning (Monday start)
bin_by_date(outbreak_data, onset_date,
  date_resolution = "isoweek"
) |>
  mutate(date_formatted = strftime(onset_date, "%G-W%V")) # Add correct date labels

# US CDC epiweek binning (Sunday start)
bin_by_date(outbreak_data, onset_date,
  date_resolution = "epiweek"
)

# With population data for incidence calculation
outbreak_data$population <- 10000
bin_by_date(outbreak_data, onset_date,
  n = cases,
  population = population
)
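The ISO week example above labels the bins with "%G-W%V" rather than "%Y-W%V". The following base R snippet, added here as an illustration with a hand-picked date, shows why this matters around the turn of the year: "%G" is the ISO 8601 week-based year that belongs to the week number "%V", while "%Y" is the calendar year.

```r
# A Monday that already belongs to ISO week 1 of 2025.
d <- as.Date("2024-12-30")

iso_label      <- format(d, "%G-W%V")  # week-based year + ISO week
calendar_label <- format(d, "%Y-W%V")  # calendar year + ISO week (misleading)

print(iso_label)
print(calendar_label)
```

Mixing the calendar year with the ISO week number would label this bin as week 1 of 2024, which is why the week-based year is used for the labels.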