fcount: Efficiently Count Observations by Group
In collapse: Advanced and Fast Data Transformation

Description

A much faster replacement for dplyr::count.

Usage

fcount(x, ..., w = NULL, name = "N", add = FALSE,
      sort = FALSE, decreasing = FALSE)

fcountv(x, cols = NULL, w = NULL, name = "N", add = FALSE,
        sort = FALSE, ...)

Arguments

`x`	a data frame or list-like object, including 'grouped_df' or 'indexed_frame'. Atomic vectors or matrices can also be passed, but will be sent through `qDF`.
`...`	for `fcount`: names or sequences of columns to count cases by - passed to `fselect`. For `fcountv`: further arguments passed to `GRP` (such as `decreasing`, `na.last`, `method`, `effect` etc.). Leaving this empty will count on all columns.
`cols`	select columns to count cases by, using column names, indices, a logical vector or a selector function (e.g. `is_categorical`).
`w`	a numeric vector of weights, may contain missing values. In `fcount` this can also be the (unquoted) name of a column in the data frame. `fcountv` also supports a single character name. Note that the corresponding argument in `dplyr::count` is called `wt`, whereas collapse names weights arguments `w` throughout.
`name`	character. The name of the column containing the count or sum of weights. In `dplyr::count` it is called `"n"`, but `"N"` is more consistent with the rest of collapse and data.table.
`add`	`TRUE` adds the count column to `x`. Alternatively `add = "group_vars"` (or `add = "gv"` for parsimony) can be used to retain only the variables selected for counting in `x` and the count.
`sort`, `decreasing`	arguments passed to `GRP` affecting the order of rows in the output (if `add = FALSE`), and the algorithm used for counting. In general, `sort = FALSE` is faster unless data is already sorted by the columns used for counting.
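
The argument descriptions above can be illustrated with a short sketch. It uses only the built-in `mtcars` and `iris` datasets; the column name `"sum_mpg"` and the object names are arbitrary choices for illustration, not part of the package.

```r
library(collapse)

# Three ways of selecting the same counting columns with fcountv():
by_name  <- fcountv(mtcars, cols = c("cyl", "vs", "am"))   # column names
by_index <- fcountv(mtcars, cols = c(2, 8, 9))             # column indices
by_lgl   <- fcountv(mtcars, cols = names(mtcars) %in% c("cyl", "vs", "am"))

# A selector function: iris has a single non-numeric column (Species)
by_fun <- fcountv(iris, cols = is_categorical)

# Weighted count: sum of mpg within each cyl group, stored in a custom column.
# sort = TRUE orders the output by cyl so it lines up with tapply() below.
wc  <- fcount(mtcars, cyl, w = mpg, name = "sum_mpg", sort = TRUE)

# fcountv() takes the weights column as a single character name
wcv <- fcountv(mtcars, cols = "cyl", w = "mpg", name = "sum_mpg", sort = TRUE)

# Base R cross-check of the weighted result
base_sums <- tapply(mtcars$mpg, mtcars$cyl, sum)
```

With `w` supplied, the output column holds the sum of weights per group rather than the number of rows, which is why the cross-check uses `sum`.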

Value

If `x` is a list, an object of the same type as `x` with a column (named by `name`) added at the end giving the count. Otherwise, if `x` is atomic, a data frame returned from `qDF(x)` with the count column added. By default (`add = FALSE`), only the unique combinations of the columns used for counting are returned, together with the count.
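
As a rough illustration of the return shapes described above, the following sketch compares the three settings of `add` on `mtcars`. The object names are arbitrary, and the checks in the comments follow from the descriptions in this page rather than from any additional guarantee.

```r
library(collapse)

counts <- fcount(mtcars, cyl)                      # add = FALSE: unique cyl values + N
added  <- fcount(mtcars, cyl, add = TRUE)          # all columns of mtcars + N
gv     <- fcount(mtcars, cyl, add = "group_vars")  # only cyl + N

nrow(counts)            # one row per distinct value of cyl
sum(counts$N)           # counts add up to nrow(mtcars)
ncol(added)             # ncol(mtcars) + 1, count column at the end
names(gv)               # the counting column followed by the count

# Atomic input is first converted to a data frame via qDF()
vec_counts <- fcount(c("a", "b", "a"))
is.data.frame(vec_counts)
```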

Examples

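library(collapse)

fcount(mtcars, cyl, vs, am)                      # count rows by cyl, vs and am
fcountv(mtcars, cols = .c(cyl, vs, am))          # same, with columns passed via cols
fcount(mtcars, cyl, vs, am, sort = TRUE)         # output ordered by cyl, vs, am
fcount(mtcars, cyl, vs, am, add = TRUE)          # keep all columns, append the count
fcount(mtcars, cyl, vs, am, add = "group_vars")  # keep only cyl, vs, am and the count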

## With grouped data
mtcars |> fgroup_by(cyl, vs, am) |> fcount()
mtcars |> fgroup_by(cyl, vs, am) |> fcount(add = TRUE)
mtcars |> fgroup_by(cyl, vs, am) |> fcount(add = "group_vars")

## With indexed data: by default counting on the first index variable
wlddev |> findex_by(country, year) |> fcount()
wlddev |> findex_by(country, year) |> fcount(add = TRUE)
# Use fcountv to pass additional arguments to GRP.pdata.frame,
# here using the effect argument to choose a different index variable
wlddev |> findex_by(country, year) |> fcountv(effect = "year")
wlddev |> findex_by(country, year) |> fcountv(add = "group_vars", effect = "year")

collapse documentation built on June 10, 2025, 9:12 a.m.