Most data operations are done on groups defined by variables.
group_by() takes an existing tbl and converts it into a grouped tbl
where operations are performed "by group".
ungroup() removes grouping.
Variables to group by. All tbls accept variable names. Some tbls will accept functions of variables. Duplicated groups will be silently dropped.
A grouped data frame if the supplied arguments yield a non-empty set of
grouping columns; otherwise a regular (ungrouped) data frame.
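The return value described above can be checked with a small sketch. It assumes dplyr is installed and uses the built-in mtcars data; the object names grouped and ungrouped are illustrative only.

```r
library(dplyr)

# At least one grouping column is supplied, so the result is a grouped tbl.
grouped <- mtcars %>% group_by(cyl)
is_grouped_df(grouped)

# No grouping columns are supplied, so the result is a regular tbl.
ungrouped <- mtcars %>% group_by()
is_grouped_df(ungrouped)
```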
group_by() is an S3 generic with methods for the three built-in
tbls. See the help for the corresponding classes and their manip
methods for more details.
The three scoped variants (group_by_all(), group_by_if() and
group_by_at()) make it easy to group a dataset by a selection of
variables.
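As a rough illustration of the scoped variants, the sketch below groups by an explicit selection with group_by_at() and by a predicate with group_by_if(). It assumes dplyr is installed and uses the built-in mtcars and iris data; by_at and by_if are illustrative names.

```r
library(dplyr)

# Group by an explicit selection of columns, wrapped in vars().
by_at <- mtcars %>% group_by_at(vars(vs, am))
group_vars(by_at)

# Group by every column that satisfies a predicate (here: factor columns).
by_if <- iris %>% group_by_if(is.factor)
group_vars(by_if)
```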
library(dplyr)

by_cyl <- mtcars %>% group_by(cyl)

# grouping doesn't change how the data looks (apart from listing
# how it's grouped):
by_cyl

# It changes how it acts with the other dplyr verbs:
by_cyl %>% summarise(
  disp = mean(disp),
  hp = mean(hp)
)
by_cyl %>% filter(disp == max(disp))

# Each call to summarise() removes a layer of grouping
by_vs_am <- mtcars %>% group_by(vs, am)
by_vs <- by_vs_am %>% summarise(n = n())
by_vs
by_vs %>% summarise(n = sum(n))

# To remove grouping, use ungroup
by_vs %>%
  ungroup() %>%
  summarise(n = sum(n))

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by
mtcars %>% group_by(vsam = vs + am)

# By default, group_by overrides existing grouping
by_cyl %>%
  group_by(vs, am) %>%
  group_vars()

# Use add = TRUE to instead append
by_cyl %>%
  group_by(vs, am, add = TRUE) %>%
  group_vars()

# when factors are involved, groups can be empty
tbl <- tibble(
  x = 1:10,
  y = factor(rep(c("a", "c"), each = 5), levels = c("a", "b", "c"))
)
tbl %>%
  group_by(y) %>%
  group_rows()