View source: R/textstat_summary.R
Count the total number of characters, tokens and sentences.
textstat_summary(x, cache = TRUE, ...)
x | corpus to be summarized
cache | if
... | additional arguments passed through to
Count the total number of characters, tokens and sentences as well as special tokens such as numbers, punctuation marks, symbols, tags and emojis.
chars = number of characters; equal to nchar()
sents = number of sentences; equal to ntoken(tokens(x, what = "sentence"))
tokens = number of tokens; equal to ntoken()
types = number of unique tokens; equal to ntype()
puncts = number of punctuation marks (^\p{P}+$)
numbers = number of numeric tokens (^\p{Sc}{0,1}\p{N}+([.,]*\p{N})*\p{Sc}{0,1}$)
symbols = number of symbols (^\p{S}$)
tags = number of tags; sum of pattern_username and pattern_hashtag in quanteda_options()
emojis = number of emojis (^\p{Emoji_Presentation}+$)
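The patterns listed above can be tried directly on a character vector. The following is a minimal base R sketch, not the package's internal code: the token vector `toks` is made up for illustration, and `grepl(..., perl = TRUE)` is an assumption standing in for whatever regex engine the function uses internally. The emoji pattern is left out because `\p{Emoji_Presentation}` is not assumed to be available in base R's PCRE.

```r
# Hypothetical tokens, for illustration only (not from the package)
toks <- c("Hello", ",", "world", "!", "$1,000.50", "42", "+", "3.14")

# Patterns copied from the list above (backslashes doubled for R strings)
pat_punct  <- "^\\p{P}+$"
pat_number <- "^\\p{Sc}{0,1}\\p{N}+([.,]*\\p{N})*\\p{Sc}{0,1}$"
pat_symbol <- "^\\p{S}$"

# Flag which tokens match each pattern as a whole token
is_punct  <- grepl(pat_punct,  toks, perl = TRUE)
is_number <- grepl(pat_number, toks, perl = TRUE)
is_symbol <- grepl(pat_symbol, toks, perl = TRUE)

# Tally per category, mirroring the puncts / numbers / symbols columns
counts <- c(puncts  = sum(is_punct),
            numbers = sum(is_number),
            symbols = sum(is_symbol))
print(counts)
```

Because every pattern is anchored with `^` and `$`, a token counts only when the whole token matches, so "Hello" and "world" fall into none of the three categories.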
corp <- data_corpus_inaugural
textstat_summary(corp, cache = TRUE)
toks <- tokens(corp)
textstat_summary(toks, cache = TRUE)
dfmat <- dfm(toks)
textstat_summary(dfmat, cache = TRUE)
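The Details section describes several columns as equal to existing functions. As a rough sketch, the same quantities can be computed by hand for comparison. This assumes the quanteda package is installed and that `nchar()` can be applied directly to a corpus object. The data frame `manual` and the two-document subset are illustrative choices, and whether the figures match `textstat_summary()` exactly may depend on the package version.

```r
library("quanteda")

# Small subset to keep the comparison quick (illustrative choice)
corp <- head(data_corpus_inaugural, 2)
toks <- tokens(corp)

# Hand-computed equivalents of the chars, sents, tokens and types columns
manual <- data.frame(
  document = docnames(corp),
  chars    = unname(nchar(corp)),
  sents    = unname(ntoken(tokens(corp, what = "sentence"))),
  tokens   = unname(ntoken(toks)),
  types    = unname(ntype(toks))
)
print(manual)
```

Comparing `manual` with the corresponding columns of `textstat_summary(corp)` is one way to check how the function counts on a given installation.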