summarize: Aggregate data using summary statistics
In tidytable: Tidy Interface to 'data.table'

Description

Aggregate data using summary statistics such as the mean or median. Aggregations can optionally be calculated by group.

Usage

summarize(
  .df,
  ...,
  .by = NULL,
  .sort = TRUE,
  .groups = "drop_last",
  .unpack = FALSE
)

summarise(
  .df,
  ...,
  .by = NULL,
  .sort = TRUE,
  .groups = "drop_last",
  .unpack = FALSE
)

Arguments

`.df`	A data.frame or data.table
`...`	Aggregations to perform
`.by`	Columns to group by.
	- A single column can be passed with `.by = d`.
	- Multiple columns can be passed with `.by = c(c, d)`.
	- `tidyselect` can be used:
	  - Single predicate: `.by = where(is.character)`
	  - Multiple predicates: `.by = c(where(is.character), where(is.factor))`
	  - A combination of predicates and column names: `.by = c(where(is.character), b)`
`.sort`	Experimental. Default `TRUE`. If `FALSE`, the original order of the grouping variables is preserved.
`.groups`	Grouping structure of the result:
	- `"drop_last"`: Drop the last level of grouping
	- `"drop"`: Drop all groups
	- `"keep"`: Keep all groups
`.unpack`	Experimental. Default `FALSE`. Whether unnamed data frame inputs should be unpacked. The user must opt in to this option because it can reduce performance.
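
The effect of `.sort` can be sketched as follows. The data, the column names `grp` and `val`, and the object names are made up for illustration. Because `.sort` is marked experimental, treat this as a sketch of the behavior described above rather than a guarantee.

```r
library(tidytable)

# Hypothetical data in which group "b" appears before group "a"
df_sort <- tidytable(
  grp = c("b", "a", "b", "a"),
  val = c(1, 2, 3, 4)
)

# Default (.sort = TRUE): result rows are ordered by the grouping column
sorted <- summarize(df_sort, total = sum(val), .by = grp)

# .sort = FALSE: groups keep the order in which they first appear
unsorted <- summarize(df_sort, total = sum(val), .by = grp, .sort = FALSE)

sorted$grp
unsorted$grp
```

Comparing `sorted$grp` with `unsorted$grp` shows whether the grouping column was reordered.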

Examples

library(tidytable)

df <- data.table(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)

# Group by a single column
df %>%
  summarize(avg_a = mean(a),
            max_b = max(b),
            .by = c)

# Group by multiple columns
df %>%
  summarize(avg_a = mean(a),
            .by = c(c, d))
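
The `tidyselect` form of `.by` listed under Arguments can be sketched in the same way. This example rebuilds the data from the examples above under the illustrative name `df_sel`; the result names `by_chr` and `overall` are also assumptions. It also shows an ungrouped call, which collapses the input to a single row.

```r
library(tidytable)

# Same data as the examples above, under an illustrative name
df_sel <- data.table(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)

# Group by every character column (here: c and d) using a predicate
by_chr <- df_sel %>%
  summarize(avg_a = mean(a),
            .by = where(is.character))

# No .by: aggregate over the whole table
overall <- df_sel %>%
  summarize(med_b = median(b))

by_chr
overall
```

Selecting grouping columns by predicate avoids hard-coding their names, which can help when the set of character columns varies between inputs.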

tidytable documentation built on Sept. 11, 2024, 8:05 p.m.