textstat_summary: Summarize documents

View source: R/textstat_summary.R

Description

Count the total number of characters, tokens and sentences.

Usage

textstat_summary(x, cache = TRUE, ...)

Arguments

x

corpus, tokens or dfm object to be summarized

cache

if TRUE, reuse the internal cache on the second and subsequent calls. Not available on Solaris.

...

additional arguments passed through to dfm()

Details

Count the total number of characters, tokens and sentences as well as special tokens such as numbers, punctuation marks, symbols, tags and emojis.
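
To make the kinds of counts listed above concrete, here is a minimal base R sketch that approximates a few of them for a single string. It is not how textstat_summary() computes its results: the helper name summarize_text(), the whitespace tokenizer and the regular expressions are all assumptions made for illustration, and the package's own tokenizer will generally give different numbers.

```r
# Rough approximation of a few summary counts using base R only.
# summarize_text() is a hypothetical helper, not part of quanteda.core.
summarize_text <- function(txt) {
  # Crude tokens: split on whitespace and drop empty strings
  toks <- strsplit(txt, "\\s+", perl = TRUE)[[1]]
  toks <- toks[nzchar(toks)]

  # Crude sentences: runs of ., ! or ? followed by whitespace or end of text
  ends <- gregexpr("[.!?]+(\\s|$)", txt, perl = TRUE)[[1]]

  data.frame(
    chars   = nchar(txt),                            # characters, including spaces
    sents   = sum(ends > 0),                         # gregexpr() returns -1 when nothing matches
    tokens  = length(toks),                          # whitespace-delimited tokens
    types   = length(unique(tolower(toks))),         # distinct lower-cased tokens
    numbers = sum(grepl("^[0-9][0-9.,]*$", toks)),   # tokens that look like numbers
    stringsAsFactors = FALSE
  )
}

res <- summarize_text("It costs 5 dollars. Really? Yes!")
print(res)
```

Because punctuation stays attached to words under whitespace splitting, this sketch cannot separate punctuation marks, symbols, tags or emojis into their own counts; those categories depend on a proper tokenizer.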

Examples

corp <- data_corpus_inaugural
textstat_summary(corp, cache = TRUE)
toks <- tokens(corp)
textstat_summary(toks, cache = TRUE)
dfmat <- dfm(toks)
textstat_summary(dfmat, cache = TRUE)

koheiw/quanteda.core documentation built on Sept. 21, 2020, 3:44 p.m.