distinct.corpus: Subset documents distinct/unique by document variables


distinct.corpus    R Documentation


Description

Select only the documents that are distinct (unique) with respect to the values of the selected document variables.

Usage

## S3 method for class 'corpus'
distinct(.data, ..., .keep_all = FALSE)

Arguments

.data

a corpus object with document variables

...

comma-separated list of unquoted document variables, or expressions involving document variables

.keep_all

If TRUE, keep all document variables in .data. If a combination of the variables given in ... occurs more than once, the first row of values for that combination is kept.
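
The effect of .keep_all can be sketched in base R on a plain data frame. This is only an illustration of the semantics described above, not the package's implementation: the data frame dv and the result names are hypothetical stand-ins for a corpus's document variables, and the sketch assumes that the corpus method follows the same "first row wins" rule.

```r
# Hypothetical stand-in for the document variables of five documents
dv <- data.frame(
  Year = c(1789, 1793, 1797, 1801, 1805),
  President = c("Washington", "Washington", "Adams", "Jefferson", "Jefferson")
)

# Analogue of .keep_all = FALSE: only the selected variable survives,
# with one row per distinct value
only_selected <- unique(dv["President"])

# Analogue of .keep_all = TRUE: every variable survives, and the first
# row for each President value is the one that is kept
first_rows <- dv[!duplicated(dv$President), ]
```

Both results have three rows. In first_rows, the Year values come from the first occurrence of each President (1789, 1797, 1801).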

Examples

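library("quanteda")       # provides data_corpus_inaugural
library("quanteda.tidy")  # provides the distinct() method for corpus objects
# one document per distinct President; other document variables are dropped
distinct(data_corpus_inaugural[1:5], President) %>%
  summary()
# one document per distinct President; all document variables are kept
distinct(data_corpus_inaugural[1:5], President, .keep_all = TRUE) %>%
  summary()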

quanteda/quanteda.tidy documentation built on April 5, 2025, 2:50 p.m.