fcount: A fast replacement for dplyr::count()
In timeplyr: Fast Tidy Tools for Date and Date-Time Manipulation

fcount

R Documentation

A fast replacement for dplyr::count()

Description

Near-identical alternative to dplyr::count().

Usage

fcount(
  data,
  ...,
  wt = NULL,
  sort = FALSE,
  order = df_group_by_order_default(data),
  name = NULL,
  .by = NULL,
  .cols = NULL
)

fadd_count(
  data,
  ...,
  wt = NULL,
  sort = FALSE,
  order = df_group_by_order_default(data),
  name = NULL,
  .by = NULL,
  .cols = NULL
)

Arguments

`data`	A data frame.
`...`	Variables to group by.
`wt`	Frequency weights. Can be `NULL` or a variable: If `NULL` (the default), counts the number of rows in each group. If a variable, computes `sum(wt)` for each group.
`sort`	If `TRUE`, will show the largest groups at the top.
`order`	Should the groups be calculated as ordered groups? If `FALSE`, this will return the groups in order of first appearance, and in many cases is faster. If `TRUE` (the default), the groups are returned in sorted order, exactly the same way as `dplyr::count`.
`name`	The name of the new column in the output. If `NULL` (the default), the column is named `n`. If there's already a column called `n`, it will use `nn`. If there are columns called `n` and `nn`, it will use `nnn`, and so on, adding `n`s until it gets a new name.
`.by`	(Optional). A selection of columns to group by for this operation. Columns are specified using tidy-select.
`.cols`	(Optional) Alternative to `...` that accepts a named character vector or a numeric vector. Recommended when speed is a priority.
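
As a minimal sketch of how the arguments above might be combined, the calls below use the built-in `mtcars` data set. The data set and the result names (`weighted`, `named`, `by_cyl`, `by_cols`) are illustrative assumptions, not part of the package documentation.

```r
library(timeplyr)
library(dplyr)

# Weighted count: sums `gear` within each `cyl` group instead of counting rows
weighted <- mtcars %>%
  fcount(cyl, wt = gear)

# Custom name for the count column
named <- mtcars %>%
  fcount(cyl, name = "n_cars")

# Group by `cyl` for this call only, via `.by`
by_cyl <- mtcars %>%
  fcount(.by = cyl)

# Select grouping columns by name with `.cols`
by_cols <- mtcars %>%
  fcount(.cols = c("cyl", "am"))
```

Use either `...` or `.cols` to choose the grouping columns in a given call; the sketch does not combine them.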

Details

This is a fast, near-identical alternative to dplyr::count() built on the collapse package. Unlike collapse::fcount(), it follows the dplyr::count() interface closely. The main difference is that anything supplied to wt is recycled and added as a data variable. Apart from that, it works the same way as the dplyr equivalent.

fcount() and fadd_count() can be more than 100x faster than the dplyr equivalents.
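
The actual speed-up depends on the number of rows, the number of groups and the hardware. A rough way to check it on your own machine is sketched below; the simulated data frame `df` and its column `g` are hypothetical, and `system.time()` gives only a coarse single-run measurement.

```r
library(timeplyr)
library(dplyr)

set.seed(42)
# Hypothetical test data: 1 million rows spread over up to 10,000 groups
df <- data.frame(g = sample.int(1e4, 1e6, replace = TRUE))

# Time one run of each implementation
system.time(res_dplyr <- count(df, g))
system.time(res_fast <- fcount(df, g))

# With the default order = TRUE both results are sorted by `g`,
# so the count columns can be compared directly
all.equal(res_dplyr$n, res_fast$n)
```

For more reliable timings, a dedicated benchmarking package that repeats each call several times is a better fit than a single `system.time()` run.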

Value

A data.frame of frequency counts by group. fadd_count() instead returns the input data with the count added as a new column.

Examples

library(timeplyr)
library(dplyr)

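# Count all rows
iris %>%
  fcount()
# Add the total row count as a column named "count", then keep the first 10 rows
iris %>%
  fadd_count(name = "count") %>%
  fslice_head(n = 10)
# Count rows per group of a grouped data frame
iris %>%
  group_by(Species) %>%
  fcount()
# Same counts, supplying the grouping variable directly
iris %>%
  fcount(Species)
# Group by computed expressions: here, the mean of each numeric column
iris %>%
  fcount(across(where(is.numeric), mean))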

### Sorting behaviour

# Sorted by group
starwars %>%
  fcount(hair_color)
# Sorted by frequency
starwars %>%
  fcount(hair_color, sort = TRUE)
# Groups sorted by order of first appearance (faster)
starwars %>%
  fcount(hair_color, order = FALSE)

timeplyr documentation built on Sept. 12, 2024, 7:37 a.m.