In cmu-delphi/epiprocess: Tools for basic signal processing in epidemiology

group_by.epi_archive

R Documentation

`group_by` and related methods for `epi_archive`, `grouped_epi_archive`

Description

`group_by` and related methods for `epi_archive` and `grouped_epi_archive` objects.

Usage

## S3 method for class 'epi_archive'
group_by(.data, ..., .add = FALSE, .drop = dplyr::group_by_drop_default(.data))

## S3 method for class 'grouped_epi_archive'
group_by(.data, ..., .add = FALSE, .drop = dplyr::group_by_drop_default(.data))

## S3 method for class 'grouped_epi_archive'
group_by_drop_default(.tbl)

## S3 method for class 'grouped_epi_archive'
group_vars(x)

## S3 method for class 'grouped_epi_archive'
groups(x)

## S3 method for class 'grouped_epi_archive'
ungroup(x, ...)

is_grouped_epi_archive(x)

Arguments

`.data`	An `epi_archive` or `grouped_epi_archive`
`...`	Similar to `dplyr::group_by` (see "Details" for edge cases). For `group_by`: unquoted variable name(s) or other "data masking" expression(s). `dplyr::mutate`-like syntax can be used here to compute new columns to group by; note that if `.data` is already grouped, these computations ignore the existing grouping (as in dplyr). For `ungroup`: either empty, to remove all grouping and output an `epi_archive`; or variable name(s) or other "tidy-select" expression(s), to remove the matching variables from the grouping variables and output another `grouped_epi_archive`.
`.add`	Boolean. If `FALSE`, the default, the output will be grouped by the variable selection from `...` only; if `TRUE`, the output will be grouped by the current grouping variables plus the variable selection from `...`.
`.drop`	As described in `dplyr::group_by`; determines treatment of factor columns.
`.tbl`	A `grouped_epi_archive` object.
`x`	For `groups`, `group_vars`, or `ungroup`: a `grouped_epi_archive`; for `is_grouped_epi_archive`: any object.

Details

To match dplyr, `group_by` accepts "data masking" (also referred to as "tidy evaluation") expressions in `...`, not just column names, in a way similar to `mutate`. However, expressions that would replace or remove key columns are disabled.
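
As a minimal sketch of this data-masking behavior, the following groups an archive by a column computed on the fly and then attempts to replace a key column. The `demo_archive` object, its two rows, and the `high_value` column are hypothetical and exist only for illustration; the exact error message for the second call is not reproduced here.

```r
library(epiprocess)
library(dplyr)

# Hypothetical two-row archive, for illustration only.
demo_archive <- tribble(
  ~geo_value, ~time_value, ~version, ~value,
  "ca", "2000-01-01", "2000-01-02", 10,
  "ny", "2000-01-01", "2000-01-02", 200
) %>%
  mutate(
    time_value = as.Date(time_value),
    version = as.Date(version)
  ) %>%
  as_epi_archive()

# Group by a new column defined with `mutate`-like syntax.
by_flag <- demo_archive %>% group_by(high_value = value > 100)
group_vars(by_flag)

# Replacing a key column (here `geo_value`) is disabled, so this call is
# expected to signal an error; `tryCatch` keeps the message, if any.
key_attempt <- tryCatch(
  demo_archive %>% group_by(geo_value = toupper(geo_value)),
  error = function(e) conditionMessage(e)
)
```

Wrapping the second call in `tryCatch` is only a convenience so that the sketch runs to completion; in interactive use the error would simply be reported.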

`archive %>% group_by()` and other expressions that group or regroup by zero columns (so that all rows are treated as one large group) output a `grouped_epi_archive`, so that `grouped_epi_archive` methods can be used on the result. This differs slightly from the same operations on tibbles and grouped tibbles, which do not output a `grouped_df` in these circumstances.
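
The contrast with tibbles can be sketched as follows. The `demo_archive` and `plain_tbl` objects are hypothetical and chosen only for illustration; `is_grouped_df` is dplyr's predicate for grouped tibbles.

```r
library(epiprocess)
library(dplyr)

# Hypothetical two-row archive, for illustration only.
demo_archive <- tribble(
  ~geo_value, ~time_value, ~version, ~value,
  "ca", "2000-01-01", "2000-01-02", 10,
  "ny", "2000-01-01", "2000-01-02", 200
) %>%
  mutate(
    time_value = as.Date(time_value),
    version = as.Date(version)
  ) %>%
  as_epi_archive()

# Grouping an archive by zero columns still gives a `grouped_epi_archive`.
zero_grouped <- demo_archive %>% group_by()
is_grouped_epi_archive(zero_grouped)
group_vars(zero_grouped)

# Grouping a tibble by zero columns does not give a `grouped_df`.
plain_tbl <- tibble(geo_value = c("ca", "ny"), value = c(10, 200))
is_grouped_df(plain_tbl %>% group_by())
```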

Using `group_by` with `.add = FALSE` to override an existing grouping is disabled; instead, `ungroup` first and then `group_by`.
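
A minimal sketch of the two supported ways to change an existing grouping, assuming a hypothetical `demo_archive` built only for illustration: either `ungroup` and then `group_by` to replace the grouping, or pass `.add = TRUE` to extend it.

```r
library(epiprocess)
library(dplyr)

# Hypothetical two-row archive, for illustration only.
demo_archive <- tribble(
  ~geo_value, ~time_value, ~version, ~value,
  "ca", "2000-01-01", "2000-01-02", 10,
  "ny", "2000-01-01", "2000-01-02", 200
) %>%
  mutate(
    time_value = as.Date(time_value),
    version = as.Date(version)
  ) %>%
  as_epi_archive()

by_geo <- demo_archive %>% group_by(geo_value)

# Replace the grouping: drop the old grouping first, then group again.
by_time <- by_geo %>%
  ungroup() %>%
  group_by(time_value)
group_vars(by_time)

# Extend the grouping instead of replacing it.
by_geo_time <- by_geo %>% group_by(time_value, .add = TRUE)
group_vars(by_geo_time)

# Overriding in a single step (default `.add = FALSE`) is disabled, so this
# call is expected to signal an error; `tryCatch` keeps the message, if any.
override_attempt <- tryCatch(
  by_geo %>% group_by(time_value),
  error = function(e) conditionMessage(e)
)
```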

`group_by_drop_default` on (ungrouped) `epi_archive`s is expected to dispatch to `group_by_drop_default.default`; there is a dedicated method for `grouped_epi_archive`s.
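
The dispatch described above can be sketched as follows, assuming a hypothetical `demo_archive` with a factor-valued `age_group` key. The sketch also assumes that the `grouped_epi_archive` method reports the `.drop` setting the archive was grouped with, and relies on dplyr's default method returning `TRUE`.

```r
library(epiprocess)
library(dplyr)

# Hypothetical archive with a factor key column, for illustration only.
demo_archive <- tribble(
  ~geo_value, ~age_group, ~time_value, ~version, ~value,
  "us", "adult", "2000-01-01", "2000-01-02", 121,
  "us", "pediatric", "2000-01-01", "2000-01-02", 5
) %>%
  mutate(
    age_group = ordered(age_group, c("pediatric", "adult")),
    time_value = as.Date(time_value),
    version = as.Date(version)
  ) %>%
  as_epi_archive(other_keys = "age_group")

# Ungrouped archive: falls through to dplyr's default method.
drop_ungrouped <- group_by_drop_default(demo_archive)

# Grouped archive: the dedicated method is used (assumed to report `.drop`).
drop_grouped <- demo_archive %>%
  group_by(geo_value, age_group, .drop = FALSE) %>%
  group_by_drop_default()
```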

Examples


library(epiprocess)

grouped_archive <- archive_cases_dv_subset %>% group_by(geo_value)

# `print` for metadata and method listing:
grouped_archive %>% print()

# The primary use for grouping is to perform a grouped `epix_slide`:

archive_cases_dv_subset %>%
  group_by(geo_value) %>%
  epix_slide(
    .f = ~ mean(.x$case_rate_7d_av),
    .before = 2,
    .versions = as.Date("2020-06-11") + 0:2,
    .new_col_name = "case_rate_3d_av"
  ) %>%
  ungroup()

# -----------------------------------------------------------------

# Advanced: some other features of dplyr grouping are implemented:

library(dplyr)
toy_archive <-
  tribble(
    ~geo_value, ~age_group, ~time_value, ~version, ~value,
    "us", "adult", "2000-01-01", "2000-01-02", 121,
    "us", "pediatric", "2000-01-02", "2000-01-03", 5, # (addition)
    "us", "adult", "2000-01-01", "2000-01-03", 125, # (revision)
    "us", "adult", "2000-01-02", "2000-01-03", 130 # (addition)
  ) %>%
  mutate(
    age_group = ordered(age_group, c("pediatric", "adult")),
    time_value = as.Date(time_value),
    version = as.Date(version)
  ) %>%
  as_epi_archive(other_keys = "age_group")

# The following are equivalent:
toy_archive %>% group_by(geo_value, age_group)
toy_archive %>%
  group_by(geo_value) %>%
  group_by(age_group, .add = TRUE)
grouping_cols <- c("geo_value", "age_group")
toy_archive %>% group_by(across(all_of(grouping_cols)))

# And these are equivalent:
toy_archive %>% group_by(geo_value)
toy_archive %>%
  group_by(geo_value, age_group) %>%
  ungroup(age_group)

# To get the grouping variable names as a character vector:
toy_archive %>%
  group_by(geo_value) %>%
  group_vars()

# To get the grouping variable names as a `list` of `name`s (a.k.a. symbols):
toy_archive %>%
  group_by(geo_value) %>%
  groups()

# `.drop = FALSE` is accepted when grouping; here it is combined with a
# grouped `epix_slide`:
toy_archive %>%
  group_by(geo_value, age_group, .drop = FALSE) %>%
  epix_slide(.f = ~ sum(.x$value), .before = 20) %>%
  ungroup()

cmu-delphi/epiprocess documentation built on April 12, 2025, 12:51 p.m.