summarize.: Aggregate data using summary statistics



Description

Aggregate data using summary statistics such as the mean or median. The statistics can optionally be calculated by group.


Usage

summarize(
  .df,
  ...,
  .by = NULL,
  .sort = TRUE,
  .groups = "drop_last",
  .unpack = FALSE
)



Arguments

.df

A data.frame or data.table.


...

Aggregations to perform.


.by

Columns to group by.

  • A single column can be passed with .by = d.

  • Multiple columns can be passed with .by = c(c, d).

  • tidyselect can be used:

    • Single predicate: .by = where(is.character)

    • Multiple predicates: .by = c(where(is.character), where(is.factor))

    • A combination of predicates and column names: .by = c(where(is.character), b)
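The tidyselect forms of .by listed above can be sketched as follows. This is a minimal illustration, not part of the original help page: the data and the result names (by_chr, by_mixed) are made up for the example, and the table is built with tidytable() rather than data.table().

```r
library(tidytable)

# Illustrative data with the same column names used on this page
df <- tidytable(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)

# Single predicate: group by every character column (here c and d)
by_chr <- df %>%
  summarize(avg_a = mean(a), .by = where(is.character))

# Predicate combined with a bare column name: group by c, d and b
by_mixed <- df %>%
  summarize(total_a = sum(a), .by = c(where(is.character), b))

print(by_chr)
print(by_mixed)
```

Because b is unique in every row, by_mixed keeps one row per input row, while by_chr collapses the two rows that share c = "a" and d = "a".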


.sort

Experimental. Default TRUE. If FALSE, the original order of the grouping variables is preserved.
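A small sketch of the difference, assuming the default sorts the result by the grouping column and .sort = FALSE keeps groups in order of first appearance. The data and names here (df2, sorted, unsorted) are illustrative only.

```r
library(tidytable)

# Group "b" appears before group "a" in the input
df2 <- tidytable(
  x = c("b", "a", "b"),
  y = 1:3
)

# Default (.sort = TRUE): result ordered by the grouping column
sorted <- df2 %>%
  summarize(total = sum(y), .by = x)

# .sort = FALSE: groups stay in the order they first appear in the data
unsorted <- df2 %>%
  summarize(total = sum(y), .by = x, .sort = FALSE)

print(sorted)
print(unsorted)
```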


.groups

Grouping structure of the result. Default "drop_last".

  • "drop_last": Drop the last level of grouping

  • "drop": Drop all groups

  • "keep": Keep all groups
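The three options can be compared on a table grouped with group_by(). This sketch assumes .groups matters for input that is already grouped (as in dplyr) rather than for grouping supplied through .by, and it uses tidytable's group_vars() to inspect the result. All object names are illustrative.

```r
library(tidytable)

df <- tidytable(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)

# Start from a table grouped by two columns
grouped <- df %>% group_by(c, d)

# Default "drop_last": the result stays grouped by c only
last_dropped <- grouped %>% summarize(avg_a = mean(a))

# "drop": the result is ungrouped
dropped <- grouped %>% summarize(avg_a = mean(a), .groups = "drop")

# "keep": the result stays grouped by both c and d
kept <- grouped %>% summarize(avg_a = mean(a), .groups = "keep")

print(group_vars(last_dropped))
print(group_vars(kept))
```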


.unpack

Experimental. Default FALSE. Whether unnamed data frame inputs should be unpacked. Users must opt in to this option because it can reduce performance.
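As a rough sketch of what opting in looks like, the call below passes one unnamed data frame that holds two summary columns. It assumes unpacking behaves as in dplyr, so the inner columns become columns of the result. The data and the name res are illustrative.

```r
library(tidytable)

df <- tidytable(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b")
)

# The unnamed tidytable() input carries two summaries at once;
# .unpack = TRUE asks summarize() to spread them into separate columns
res <- df %>%
  summarize(
    tidytable(avg_a = mean(a), max_b = max(b)),
    .by = c,
    .unpack = TRUE
  )

print(names(res))
```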


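Examples

library(tidytable)

df <- data.table(
  a = 1:3,
  b = 4:6,
  c = c("a", "a", "b"),
  d = c("a", "a", "b")
)

# Summarize two columns, grouped by a single column
df %>%
  summarize(avg_a = mean(a),
            max_b = max(b),
            .by = c)

# Group by multiple columns
df %>%
  summarize(avg_a = mean(a),
            .by = c(c, d))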

tidytable documentation built on Sept. 23, 2022, 5:10 p.m.