tally: Count/tally observations by group

In tidyverse/dplyr: A Grammar of Data Manipulation

Description

`tally()` is a convenient wrapper for `summarise()` that calls either `n()` or `sum(n)`, depending on whether you're tallying for the first time or re-tallying. `count()` is similar but calls `group_by()` before and `ungroup()` after.

`add_tally()` adds a column `n` to a table based on the number of items within each existing group, while `add_count()` is a shortcut that does the grouping as well. These functions are to `tally()` and `count()` as `mutate()` is to `summarise()`: they add an additional column rather than collapsing each group.
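The relationship described above can be checked with a minimal sketch. It assumes dplyr is attached and uses the built-in `mtcars` data; the object names `by_hand`, `with_count`, and `with_add_count` are illustrative only.

```r
library(dplyr)

# count() gives the same counts as group_by() + summarise(n = n())
by_hand    <- mtcars %>% group_by(cyl) %>% summarise(n = n())
with_count <- mtcars %>% count(cyl)

# add_count() keeps every row and attaches the group size as a new column,
# in the same way that mutate() keeps rows where summarise() collapses them
with_add_count <- mtcars %>% add_count(cyl)

nrow(with_count)      # one row per distinct value of cyl
nrow(with_add_count)  # same number of rows as mtcars
```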

Usage

```
tally(x, wt, sort = FALSE, name = "n")

count(x, ..., wt = NULL, sort = FALSE, name = "n")

add_tally(x, wt, sort = FALSE, name = "n")

add_count(x, ..., wt = NULL, sort = FALSE, name = "n")
```

Arguments

`x`: a `tbl()` to tally/count.

`wt`: (Optional) If omitted (and no variable named `n` exists in the data), will count the number of rows. If specified, will perform a "weighted" tally by summing the (non-missing) values of variable `wt`. A column named `n` (but not `nn` or `nnn`) will be used as weighting variable by default in `tally()`, but not in `count()`. This argument is automatically quoted and later evaluated in the context of the data frame. It supports unquoting. See `vignette("programming")` for an introduction to these concepts.

`sort`: if `TRUE` will sort output in descending order of `n`.

`name`: The output column name. If omitted, it will be `n`.

`...`: Variables to group by.
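As a rough illustration of the `wt`, `sort`, and `name` arguments, the sketch below compares a plain count with a weighted one. The `orders` table and its columns `product` and `qty` are hypothetical data invented for this example.

```r
library(dplyr)

# Hypothetical data: each row is an order line with a quantity (one missing)
orders <- tibble(
  product = c("apple", "apple", "pear", "pear", "pear"),
  qty     = c(2, 3, 1, NA, 6)
)

# Unweighted: number of rows per product
row_counts <- orders %>% count(product)

# Weighted: sum of the non-missing qty values per product
qty_totals <- orders %>% count(product, wt = qty)

# Weighted, sorted in descending order, with a custom output column name
qty_sorted <- orders %>%
  count(product, wt = qty, sort = TRUE, name = "total_qty")
```

Here the missing `qty` value is ignored by the weighted tally, so only the non-missing quantities contribute to each product's total.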

Value

A tbl, grouped the same way as `x`.

Note

The column name in the returned data is usually `n`, even if you have supplied a weight.

If the data already has a column named `n`, the output column will be called `nn`. If the table already has columns called `n` and `nn` then the column returned will be `nnn`, and so on.

To control the output column name use `name`.
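When the input already contains a column called `n`, supplying `name` and `wt` explicitly avoids relying on the automatic naming and default weighting described above. The sketch below assumes a small hypothetical table `pre_counted`; the output names `rows` and `total` are arbitrary choices.

```r
library(dplyr)

# Hypothetical table that already carries a column named n
pre_counted <- tibble(
  g = c("a", "a", "b"),
  n = c(10, 20, 30)
)

# Count rows per group under an explicit output name
rows_per_group <- pre_counted %>% count(g, name = "rows")

# Re-tally by summing the existing n column, again with an explicit name
grand_total <- pre_counted %>% tally(wt = n, name = "total")
```

Naming the output explicitly keeps downstream code independent of whether the result would otherwise be called `n`, `nn`, or `nnn`.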

Examples

```
# tally() is short-hand for summarise()
mtcars %>% tally()
mtcars %>% group_by(cyl) %>% tally()
# count() is a short-hand for group_by() + tally()
mtcars %>% count(cyl)

# add_tally() is short-hand for mutate()
mtcars %>% add_tally()
# add_count() is a short-hand for group_by() + add_tally()
mtcars %>% add_count(cyl)

# count() and tally() are designed so that you can call
# them repeatedly, each time rolling up a level of detail
species <-
  starwars %>%
  count(species, homeworld, sort = TRUE)
species
species %>% count(species, sort = TRUE)

# Change the name of the newly created column:
species <-
  starwars %>%
  count(species, homeworld, sort = TRUE, name = "n_species_by_homeworld")
species
species %>%
  count(species, sort = TRUE, name = "n_species")

# add_count() is useful for groupwise filtering
# e.g.: show details for species that have a single member
starwars %>%
  add_count(species) %>%
  filter(n == 1)
```

tidyverse/dplyr documentation built on Jan. 11, 2019, 11:08 a.m.