predict_lda: Predict topics of tweets using fitted LDA model
In abuchmueller/Twitmo: Twitter Topic Modeling and Visualization for R

predict_lda

R Documentation

Description

Predict topics of tweets using fitted LDA model.

Usage

predict_lda(
  data,
  lda_model,
  response = "max",
  remove_numbers = TRUE,
  remove_punct = TRUE,
  remove_symbols = TRUE,
  remove_url = TRUE
)

Arguments

`data`	Data frame containing tweets and hashtags. Any data frame works, as long as it has a "text" column of type character and a "hashtags" column with comma-separated character vectors. It can be obtained either by using `load_tweets` on a JSON object returned by Twitter's API v1.1 or by using `stream_in` on any JSON file that has a "text" and a "hashtags" field. If you are unsure about the requirements, you can load the sample data contained in the package by following the example in the Examples section of this help page.
`lda_model`	Fitted LDA Model. Object of class LDA.
`response`	Type of response. Either "prob" for the predicted probability of each topic or "max" for the single most likely topic (default).
`remove_numbers`	Logical. If `TRUE` remove tokens that consist only of numbers, but not words that start with digits, e.g. 2day. See tokens.
`remove_punct`	Logical. If `TRUE` remove all characters in the Unicode "Punctuation" [P] class, with exceptions for those used as prefixes for valid social media tags if `preserve_tags = TRUE`. See tokens.
`remove_symbols`	Logical. If `TRUE` remove all characters in the Unicode "Symbol" [S] class.
`remove_url`	Logical. If `TRUE` find and eliminate URLs beginning with http(s).
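
The four `remove_*` arguments above mirror arguments of the same names in quanteda's `tokens()`. As a rough sketch of their effect, assuming `predict_lda` forwards them to `tokens()` unchanged (this page does not confirm that), the flags can be tried directly on a made-up tweet:

```r
library(quanteda)

# Made-up tweet text, for illustration only
txt <- "I luv 2day!!! 100 reasons :) #rstats https://t.co/abc123"

# The same four flags that predict_lda() exposes, at their default values.
# Assumption: predict_lda() passes them on to tokens() as-is.
toks <- tokens(
  txt,
  remove_numbers = TRUE,  # drop tokens made up only of digits
  remove_punct   = TRUE,  # drop Unicode punctuation
  remove_symbols = TRUE,  # drop Unicode symbols
  remove_url     = TRUE   # drop URLs beginning with http(s)
)

print(toks)
```

Setting individual flags to `FALSE` and printing the result again is a quick way to see which tokens each flag is responsible for removing.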

Value

Data frame containing either the predicted topic for each tweet or the predicted probability of each topic, depending on `response`.
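
The relationship between the two response types can be illustrated in base R. This is only a sketch with a hypothetical probability matrix; the values and the column names `doc` and `topic` are made up, and the actual layout of the data frame returned by `predict_lda` may differ:

```r
# Hypothetical document-by-topic probability matrix (each row sums to 1).
# These numbers are invented for illustration, not output of predict_lda().
probs <- matrix(
  c(0.70, 0.20, 0.10,
    0.15, 0.25, 0.60),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("tweet_1", "tweet_2"), paste0("topic_", 1:3))
)

# "prob"-style response: one probability per topic for every document
prob_response <- as.data.frame(probs)

# "max"-style response: index of the most probable topic per document
max_response <- data.frame(
  doc   = rownames(probs),
  topic = max.col(probs, ties.method = "first")
)

print(prob_response)
print(max_response)
```

In this toy case the "max" view keeps topic 1 for the first tweet and topic 3 for the second, and discards how confident each assignment is; the "prob" view keeps that information.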

Examples

## Not run: 

library(Twitmo)

# load tweets (included in package)
mytweets <- load_tweets(system.file("extdata", "tweets_20191027-141233.json", package = "Twitmo"))

# Pool tweets into longer pseudo-documents
pool <- pool_tweets(data = mytweets)
pooled_dfm <- pool$document_term_matrix

# fit your LDA model with 7 topics
model <- fit_lda(pooled_dfm, n_topics = 7, method = "Gibbs")

# Predict topics of tweets using your fitted LDA model
predict_lda(mytweets, model, response = "prob")

## End(Not run)

abuchmueller/Twitmo documentation built on Sept. 14, 2022, 8:06 p.m.