group_by: Group by one or more variables


View source: R/group-by.r

Description

Most data operations are done on groups defined by variables. group_by() takes an existing tbl and converts it into a grouped tbl where operations are performed "by group". ungroup() removes grouping.

Usage

group_by(.data, ..., add = FALSE)

ungroup(x, ...)

Arguments

.data

A tbl

...

Variables to group by. All tbls accept variable names. Some tbls will accept functions of variables. Duplicated groups will be silently dropped.

add

When add = FALSE, the default, group_by() will override existing groups. To add to the existing groups, use add = TRUE.

x

A tbl()
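As a small illustration of the note on duplicated groups, the sketch below lists the same variable twice. It assumes dplyr is attached and uses the built-in mtcars data; the name dup_vars is purely illustrative. The resulting grouping should contain cyl only once.

```r
library(dplyr)

# Naming the same grouping variable twice should not create two groupings;
# the duplicate is expected to be dropped without a warning.
dup_vars <- mtcars %>%
  group_by(cyl, cyl) %>%
  group_vars()

dup_vars
```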

Value

A grouped data frame if the combination of ... and add yields a non-empty set of grouping columns; a regular (ungrouped) data frame otherwise.
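The two possible return types can be checked with is_grouped_df(). This is a minimal sketch using mtcars; the names grouped and no_groups are illustrative only.

```r
library(dplyr)

# At least one grouping column: the result is a grouped data frame.
grouped <- mtcars %>% group_by(cyl)

# No grouping columns at all: the result is a regular (ungrouped) data frame.
no_groups <- mtcars %>% group_by()

is_grouped_df(grouped)
is_grouped_df(no_groups)
```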

Tbl types

group_by() is an S3 generic with methods for the three built-in tbls. See the help for the corresponding classes and their manip methods for more details.

Scoped grouping

The three scoped variants (group_by_all(), group_by_if() and group_by_at()) make it easy to group a dataset by a selection of variables.
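A brief sketch of how the three scoped variants might be used follows. The datasets (mtcars, iris) and the small tibble are chosen only for illustration. Later dplyr releases supersede these variants, so treat this as a sketch for the version documented here.

```r
library(dplyr)

# group_by_at(): group by an explicit selection of columns.
at_vars <- mtcars %>%
  group_by_at(vars(vs, am)) %>%
  group_vars()

# group_by_if(): group by every column that satisfies a predicate.
if_vars <- iris %>%
  group_by_if(is.factor) %>%
  group_vars()

# group_by_all(): group by every column in the table.
small <- tibble(a = c(1, 1, 2), b = c("x", "x", "y"))
all_vars <- small %>%
  group_by_all() %>%
  group_vars()

at_vars
if_vars
all_vars
```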

See Also

Other grouping functions: group_by_all, group_indices, group_keys, group_map, group_nest, group_rows, group_size, group_trim, groups

Examples

library(dplyr)

by_cyl <- mtcars %>% group_by(cyl)

# grouping doesn't change how the data looks (apart from listing
# how it's grouped):
by_cyl

# It changes how it acts with the other dplyr verbs:
by_cyl %>% summarise(
  disp = mean(disp),
  hp = mean(hp)
)
by_cyl %>% filter(disp == max(disp))

# Each call to summarise() removes a layer of grouping
by_vs_am <- mtcars %>% group_by(vs, am)
by_vs <- by_vs_am %>% summarise(n = n())
by_vs
by_vs %>% summarise(n = sum(n))

# To remove grouping, use ungroup()
by_vs %>%
  ungroup() %>%
  summarise(n = sum(n))

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by
mtcars %>% group_by(vsam = vs + am)

# By default, group_by overrides existing grouping
by_cyl %>%
  group_by(vs, am) %>%
  group_vars()

# Use add = TRUE to instead append
by_cyl %>%
  group_by(vs, am, add = TRUE) %>%
  group_vars()

# when factors are involved, groups can be empty
tbl <- tibble(
  x = 1:10,
  y = factor(rep(c("a", "c"), each = 5), levels = c("a", "b", "c"))
)
tbl %>%
  group_by(y) %>%
  group_rows()

tidyverse/dplyr documentation built on Jan. 11, 2019, 11:08 a.m.