plotTopic: Plotting Counts of Topics over Time (Relative to Corpus)


View source: R/plotTopic.R

Description

Plots the counts or proportions of selected topics from an LDAgen result over time. All curves can be drawn in one plot, or each curve in its own plot (see pages). The plots can also be written to a pdf by setting file.

Usage

plotTopic(
  object,
  ldaresult,
  ldaID,
  select = 1:nrow(ldaresult$document_sums),
  tnames,
  rel = FALSE,
  mark = TRUE,
  unit = "month",
  curves = c("exact", "smooth", "both"),
  smooth = 0.05,
  main,
  xlab,
  ylim,
  ylab,
  both.lwd,
  both.lty,
  col,
  legend = ifelse(pages, "onlyLast:topright", "topright"),
  pages = FALSE,
  natozero = TRUE,
  file,
  ...
)

Arguments

object

textmeta object with strictly tokenized text component (character vectors) - such as a result of cleanTexts

ldaresult

The result of a call to LDAgen

ldaID

Character vector of IDs of the documents in ldaresult

select

Integer: Which topics of ldaresult should be plotted (default: all topics)?

tnames

Character vector of the same length as select: labels for the topics (default: for each topic, the first word returned by top.topic.words from the lda package)

rel

Logical: Should counts (FALSE) or proportion (TRUE) be plotted (default: FALSE)?

mark

Logical: Should years be marked by vertical lines (default: TRUE)?

unit

Character: To which unit should dates be floored (default: "month")? Other possible units are "bimonth", "quarter", "season", "halfyear" and "year"; for more units see round_date.

curves

Character: Should the "exact" curve, the "smooth" curve, or "both" be plotted (default: "exact")?

smooth

Numeric: Smoothing parameter that is passed to lowess as f (default: 0.05)

main

Character: Graphical parameter

xlab

Character: Graphical parameter

ylim

Graphical parameter

ylab

Character: Graphical parameter

both.lwd

Graphical parameter for smoothed values if curves = "both"

both.lty

Graphical parameter for smoothed values if curves = "both"

col

Graphical parameter, can be a vector. If curves = "both", the function plots, for every topic, first the exact and then the smoothed curve. Keep this in mind when ordering the entries of col.

legend

Character: Value(s) specifying the legend coordinates (default: "topright" if pages = FALSE, "onlyLast:topright" if pages = TRUE). If "none", no legend is plotted.

pages

Logical: Should each curve be plotted on its own page instead of all curves in a single plot (default: FALSE)? In addition, you can set legend = "onlyLast:<argument>", where <argument> is a character legend argument, to draw a legend only on the last plot of the set.

natozero

Logical: Should NAs be coerced to zeros (default: TRUE)? Only has an effect if rel = TRUE.

file

Character: File path if a pdf should be created

...

Additional graphical parameters
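The interplay of unit, rel and smooth can be illustrated with a small base R sketch. It does not reproduce the internals of plotTopic: the toy matrix doc_sums only imitates the shape of ldaresult$document_sums (topics in rows, documents in columns), the proportions are taken relative to the toy topics only (the denominator used by plotTopic may differ), and all object names are assumptions made for this illustration.

```r
# Toy data (hypothetical): one document per day of 2020.
set.seed(1)
dates <- seq(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "day")
n_docs <- length(dates)

# Topics x documents matrix of word counts, shaped like document_sums.
doc_sums <- matrix(rpois(3 * n_docs, lambda = 5), nrow = 3)

# unit = "month": floor every date to the first day of its month.
month <- as.Date(format(dates, "%Y-%m-01"))

# rel = FALSE: sum the counts of each topic within each month.
counts <- rowsum(t(doc_sums), group = as.character(month))  # months x topics

# rel = TRUE (simplified): share of each topic per month.
props <- counts / rowSums(counts)

# curves = "smooth": lowess with span f, the role of the smooth argument.
# A larger span than the default 0.05 is used here because there are
# only 12 points.
x <- as.numeric(as.Date(rownames(counts)))
sm <- lowess(x, counts[, 1], f = 0.5)

# curves = "both": exact curve first, then the smoothed one on top.
plot(x, counts[, 1], type = "l", xaxt = "n", xlab = "date", ylab = "counts")
axis(1, at = x, labels = format(as.Date(rownames(counts)), "%b"))
lines(sm$x, sm$y, lty = 2)
```

With so few time points a small span barely smooths at all, which is why the sketch deviates from the default of 0.05.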

Value

A plot. Returned invisibly: a data frame with a date column and one column per entry of tnames, containing the counts or proportions of the selected topics.
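Because the data frame is returned invisibly, it has to be assigned to be reused. The following sketch shows one possible way to post-process such a result in base R. The data frame res is a hypothetical stand-in with the documented layout (a date column plus one column per topic label); the column names T1.word and T2.word and their values are made up for illustration.

```r
# Hypothetical stand-in for the invisibly returned data frame.
# In practice one would assign the result of the call, e.g.
#   res <- plotTopic(object = ..., ldaresult = ..., ldaID = ...)
res <- data.frame(
  date = seq(as.Date("2020-01-01"), by = "month", length.out = 24),
  T1.word = rep(c(10, 20), 12),
  T2.word = rep(c(5, 0), 12)
)

# Aggregate the monthly values to yearly totals.
res$year <- format(res$date, "%Y")
yearly <- aggregate(cbind(T1.word, T2.word) ~ year, data = res, FUN = sum)
print(yearly)
```

Summing is only meaningful for counts (rel = FALSE); proportions would need to be recomputed from the underlying counts rather than added up.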

Examples

## Not run: 
data(politics)
poliClean <- cleanTexts(politics)
words10 <- makeWordlist(text=poliClean$text)
words10 <- words10$words[words10$wordtable > 10]
poliLDA <- LDAprep(text=poliClean$text, vocab=words10)
LDAresult <- LDAgen(documents=poliLDA, K=10, vocab=words10)

# plot all topics
plotTopic(object=poliClean, ldaresult=LDAresult, ldaID=names(poliLDA))

# plot selected topics only (here: topics 1 and 4)
plotTopic(object=poliClean, ldaresult=LDAresult, ldaID=names(poliLDA), select=c(1,4))

## End(Not run)

tosca documentation built on Oct. 28, 2021, 5:07 p.m.