dfCount: Count number of rows per group
In daattali/rsalad: A mix of useful R functions that are good for you


Count how many times each distinct value of a data.frame column is observed.

dfCount(df, col, sort = TRUE, name = "total")

`df`	A data.frame.
`col`	The column to count.
`sort`	Whether to sort the result by the total column.
`name`	The name of the total column.

dfCount(x, "y") is similar in functionality to table(x$y), but performs better on large datasets (according to my not-so-thorough testing).

There are two main differences between dfCount and table:

1. dfCount returns a data.frame instead of a table object

2. dfCount includes a row for the number of NA observations, whereas table does not by default
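The table side of these two differences can be checked with base R alone. The following is a minimal sketch on a made-up vector (the data and variable names are illustrative assumptions, not taken from the package); it shows that table drops NA unless asked otherwise, and that its result needs an explicit conversion to become a data.frame.

```r
# Small vector with one missing value (illustrative data only)
x <- c("a", "b", "a", NA)

# By default, table() leaves NA out of the counts
tab_default <- table(x)

# useNA = "ifany" adds an NA entry when missing values are present
tab_with_na <- table(x, useNA = "ifany")

# A table object must be converted to get a two-column data.frame
counts_df <- as.data.frame(tab_with_na, stringsAsFactors = FALSE)

print(tab_default)
print(counts_df)
```
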

A data.frame with two columns: the first column holds the distinct values of the given column, and the second column gives the number of rows with each value.

This function performs much faster than the equivalent table call on large datasets, even though the table function does not sort the results. Most of the speedup comes from using 'dplyr'.

For example, with the following data.frame

df <- data.frame(a = rep(1:50, 100000))

running dfCount(df, "a") 50 times on my machine is, on average, about 10x faster than table(df$a) (217 milliseconds vs 2112 milliseconds).
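A rough way to reproduce this comparison is sketched below. It assumes the rsalad package is installed (the dfCount timing is skipped otherwise) and uses 3 repetitions instead of 50 to keep the run short; the variable names are illustrative, and timings will vary by machine and package versions.

```r
# Same data.frame as in the example above: 5 million rows, 50 distinct values
df <- data.frame(a = rep(1:50, 100000))

reps <- 3  # the figures quoted above used 50 repetitions

# Average elapsed seconds per table() call
t_table <- system.time(for (i in seq_len(reps)) table(df$a))["elapsed"] / reps
cat("table:   ", t_table, "seconds per call\n")

# Time dfCount only if rsalad is available
if (requireNamespace("rsalad", quietly = TRUE)) {
  t_dfcount <- system.time(
    for (i in seq_len(reps)) rsalad::dfCount(df, "a")
  )["elapsed"] / reps
  cat("dfCount: ", t_dfcount, "seconds per call\n")
}
```
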

See the package vignette for more benchmarking analysis.

The dplyr package is required for this function.
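To give a feel for the kind of dplyr call involved, here is a minimal sketch of counting a column directly with dplyr::count. This is an assumption about a comparable approach, not necessarily how dfCount is implemented internally; the data are made up, and the `name` argument of dplyr::count requires a reasonably recent dplyr version.

```r
# Illustrative data with one missing value
df <- data.frame(a = c("x", "y", "x", NA), stringsAsFactors = FALSE)

if (requireNamespace("dplyr", quietly = TRUE)) {
  # Count rows per distinct value of `a`, largest count first,
  # storing the counts in a column called "total"
  counts <- dplyr::count(df, a, sort = TRUE, name = "total")
  print(counts)
}
```

Like dfCount, dplyr::count keeps a row for NA values, so the result here has three rows.
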

plotCount

if (requireNamespace("nycflights13", quietly = TRUE)) {
  flights <- nycflights13::flights
  dfCount(flights, "dest")
  dfCount(flights, "dest", sort = FALSE)
  dfCount(flights, "dest", name = "flights")
}

dfCount(infert, "education")
dfCount(infert, "education", sort = FALSE)
data.frame(table(infert$education))