plot.vocabularyComparison: visualize vocabularyComparison
In corpustools: Managing, Querying and Analyzing Tokenized Text

plot.vocabularyComparison

R Documentation

visualize vocabularyComparison

Description

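Visualize a vocabularyComparison object, either as a plot of both overrepresented and underrepresented words or as a word cloud of one side only.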

Usage

## S3 method for class 'vocabularyComparison'
plot(
  x,
  n = 25,
  mode = c("both", "ratio_x", "ratio_y"),
  balance = TRUE,
  size = c("chi2", "freq", "ratio"),
  ...
)

Arguments

`x`	a vocabularyComparison object, created with the compare_corpus or compare_subset method
`n`	the number of words in the plot
`mode`	use "both" to plot both overrepresented and underrepresented words using the plot_words function. The x-axis shows the log ratio of each term: negative values are underrepresented, positive values are overrepresented. Use "ratio_x" or "ratio_y" to plot only overrepresented or only underrepresented words, respectively, using dtm_wordcloud.
`balance`	if TRUE, show an equal number of terms on the left (underrepresented) and right (overrepresented) side. If FALSE, the terms with the highest chi2 scores are used, regardless of ratio.
`size`	use "freq", "chi2" or "ratio" for determining the size of words
`...`	additional arguments passed to plot_words ("both" mode) or dtm_wordcloud (ratio modes)
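
To make the `balance` argument concrete, the selection it describes can be sketched in base R. This is an illustration only, not the corpustools implementation: the data frame `terms`, its columns `ratio` and `chi2`, and the helper `select_terms` are hypothetical, and splitting `n` evenly over the two sides is an assumption.

```r
# Hypothetical comparison table: one row per term, with a ratio
# (values above 1 are overrepresented) and a chi2 score.
terms <- data.frame(
  feature = c("tax", "cut", "relief", "war", "peace", "school"),
  ratio   = c(8.0, 4.0, 2.0, 0.5, 0.25, 0.8),
  chi2    = c(50, 30, 10, 5, 20, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of corpustools.
select_terms <- function(d, n, balance = TRUE) {
  d <- d[order(-d$chi2), ]                     # highest chi2 first
  if (!balance) return(head(d, n))             # top chi2 terms, regardless of ratio
  half  <- n %/% 2                             # assumption: split n evenly
  under <- head(d[log(d$ratio) < 0, ], half)   # left side (negative log ratio)
  over  <- head(d[log(d$ratio) > 0, ], half)   # right side (positive log ratio)
  rbind(under, over)
}

balanced   <- select_terms(terms, n = 4, balance = TRUE)
unbalanced <- select_terms(terms, n = 4, balance = FALSE)
```

With `balance = TRUE` the sketch keeps two terms from each side of the log-ratio axis. With `balance = FALSE` it keeps the four highest chi2 terms, three of which happen to be overrepresented.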

Examples

## As an example, compare SOTU paragraphs about taxes to the rest
tc = create_tcorpus(sotu_texts[1:100,], doc_column = 'id')
comp = compare_subset(tc, 'token', query_x = 'tax*')


plot(comp, balance=TRUE)
plot(comp, mode = 'ratio_x')
plot(comp, mode = 'ratio_y')
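
The other arguments listed under Usage can be set in the same way. The following sketch rebuilds the comparison object so that it runs on its own; it assumes the corpustools package is installed, and the values chosen for `n` and `size` are illustrative only.

```r
library(corpustools)

## Same comparison as above: paragraphs matching 'tax*' versus the rest
tc = create_tcorpus(sotu_texts[1:100,], doc_column = 'id')
comp = compare_subset(tc, 'token', query_x = 'tax*')

## Fewer words, sized by frequency instead of the default chi2
plot(comp, n = 15, size = 'freq')

## Top chi2 terms regardless of which side they fall on
plot(comp, balance = FALSE)
```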

corpustools documentation built on Aug. 8, 2025, 6:08 p.m.