add_count.corpus: Add count of observations to corpus


add_count.corpus    R Documentation

Add count of observations to corpus

Description

add_count() and add_tally() are wrappers around dplyr::add_count() and dplyr::add_tally() that add a new document variable holding the number of observations. add_count() counts within the groups defined by the variables given in ..., and is a shortcut for group_by() + add_tally().
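The shortcut relationship can be illustrated with the underlying dplyr verbs. This is a minimal sketch on a plain tibble, not on a corpus: the `docs` data and its `party` column are hypothetical stand-ins for document variables, and the corpus methods are assumed to behave the same way because they wrap these functions.

```r
library(dplyr)

# Hypothetical stand-in for a corpus's document variables
docs <- tibble(
  doc_id = c("d1", "d2", "d3", "d4"),
  party  = c("A", "A", "B", "A")
)

# add_count() groups by `party`, counts, and returns ungrouped data
via_count <- docs %>% add_count(party)

# The long form: group, tally, then drop the grouping again
via_tally <- docs %>%
  group_by(party) %>%
  add_tally() %>%
  ungroup()

# Both routes should produce the same counts in column `n`
identical(via_count$n, via_tally$n)
```

Every row keeps its place; only the new column is added, which is what distinguishes add_count() from count().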

Usage

## S3 method for class 'corpus'
add_count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = NULL)

## S3 method for class 'corpus'
add_tally(x, ..., wt = NULL, sort = FALSE, name = NULL)

Arguments

x

a quanteda corpus object

...

for add_count(), document variables to group by; for add_tally(), additional arguments passed to the method

wt

frequency weights. Can be NULL or a variable:

  • If NULL (the default), counts the number of rows (documents) in each group

  • If a variable, computes sum(wt) for each group

sort

if TRUE, sorts the output in descending order of n

name

the name of the new column (document variable) in the output. If omitted, it defaults to n. If a column called n already exists, an error is raised and you must specify name explicitly.

.drop

not used for corpus objects; included for compatibility with the generic
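The effect of wt, sort, and name can be sketched with the dplyr functions that these methods wrap. The example below runs on a plain tibble rather than a corpus; `docs`, `party`, `pages`, and `party_pages` are hypothetical names chosen for illustration, and the corpus methods are assumed to pass these arguments through unchanged.

```r
library(dplyr)

# Hypothetical stand-in for a corpus's document variables
docs <- tibble(
  doc_id = c("d1", "d2", "d3", "d4"),
  party  = c("B", "A", "A", "A"),
  pages  = c(4, 2, 5, 1)
)

# wt = NULL (default): n is the number of rows per party
unweighted <- docs %>% add_count(party)

# wt = pages: the new column holds sum(pages) per party,
# stored under a custom name instead of the default `n`
weighted <- docs %>% add_count(party, wt = pages, name = "party_pages")

# sort = TRUE: rows are reordered by descending n
sorted <- docs %>% add_count(party, sort = TRUE)
```

Supplying name is worth doing whenever wt is set, since a column called n then holds a weighted sum rather than a count.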

Value

a corpus with an additional document variable containing counts

Examples

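# Load quanteda (provides data_corpus_inaugural) and quanteda.tidy (provides the corpus methods)
library("quanteda")
library("quanteda.tidy")

# Count documents by President and add as a variable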
data_corpus_inaugural %>%
  add_count(President) %>%
  summary(n = 10)

# Add total count to each document
data_corpus_inaugural %>%
  head() %>%
  add_tally() %>%
  summary()

# Count by multiple variables
data_corpus_inaugural %>%
  add_count(Party, President) %>%
  summary(n = 10)

# Use custom name
data_corpus_inaugural %>%
  add_count(Party, name = "party_count") %>%
  summary(n = 10)

# Add the total count after selecting the first six documents with slice()
data_corpus_inaugural %>%
  slice(1:6) %>%
  add_tally() %>%
  summary()

quanteda.tidy documentation built on Dec. 17, 2025, 5:09 p.m.