count_tcorpus: Count results of search hits, or of a given feature in tokens

View source: R/aggregate.r

count_tcorpus R Documentation

Description

Count results of search hits, or of a given feature in tokens

Usage

count_tcorpus(
  tc,
  meta_cols = NULL,
  hits = NULL,
  feature = NULL,
  count = c("documents", "tokens", "hits"),
  wide = TRUE
)

Arguments

tc

A tCorpus

meta_cols

The names of the columns in the meta data by which the results should be grouped

hits

A featureHits or contextHits object (the output of search_features, search_dictionary or search_contexts)

feature

Instead of hits, the name of a specific feature column in the tokens can be given, in which case the values of that feature are counted.

count

How should the results be counted? Number of documents, tokens, or unique hits. The difference between tokens and hits is that hits can encompass multiple tokens (e.g., "Bob Smith" is 1 hit and 2 tokens).

wide

Should results be in wide format (TRUE, the default) or long format (FALSE)?

Value

A data table

Examples


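library(corpustools)
# Build a tCorpus from the State of the Union demo data shipped with corpustools
tc = create_tcorpus(sotu_texts, doc_col='id')
# Two labelled queries: the phrase "united states" and words starting with "econom"
hits = search_features(tc, c("US# <united states>", "Economy# econom*"))
# Count over the whole corpus, then grouped by president (wide and long format)
count_tcorpus(tc, hits=hits)
count_tcorpus(tc, hits=hits, meta_cols='president')
count_tcorpus(tc, hits=hits, meta_cols='president', wide=FALSE)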
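
The count argument is easiest to understand by running the same hits through all three settings. The following is a minimal sketch rather than part of the package examples. It assumes corpustools is installed and reuses the sotu_texts demo data; the variable names by_docs, by_tokens and by_hits are only illustrative. Because the "US" query matches the two-token phrase "united states", its token count would be expected to exceed its hit count, whereas the single-word "Economy" matches should give the same number for both.

```r
library(corpustools)

# Same corpus and queries as in the examples above
tc = create_tcorpus(sotu_texts, doc_col='id')
hits = search_features(tc, c("US# <united states>", "Economy# econom*"))

# Number of documents that contain at least one hit for each query
by_docs = count_tcorpus(tc, hits=hits, count='documents')

# Number of tokens that are part of a hit ("united states" contributes 2)
by_tokens = count_tcorpus(tc, hits=hits, count='tokens')

# Number of unique hits ("united states" contributes 1)
by_hits = count_tcorpus(tc, hits=hits, count='hits')

print(by_docs)
print(by_tokens)
print(by_hits)
```

Comparing by_tokens with by_hits shows how much of the difference comes from multi-token matches.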


corpustools documentation built on May 31, 2023, 8:45 p.m.