predict_lda: Predict topics of tweets using fitted LDA model

View source: R/fit_lda.R

predict_lda    R Documentation

Predict topics of tweets using fitted LDA model

Description

Predict topics of tweets using fitted LDA model.

Usage

predict_lda(
  data,
  lda_model,
  response = "max",
  remove_numbers = TRUE,
  remove_punct = TRUE,
  remove_symbols = TRUE,
  remove_url = TRUE
)

Arguments

data

Data frame containing tweets and hashtags. Any data frame works as long as it has a "text" column of type character and a "hashtags" column of comma-separated character vectors. It can be obtained either by using load_tweets on a JSON object returned by Twitter's API v1.1 or by using stream_in on any JSON file that has a "text" and a "hashtags" field. If you are unsure about the requirements, you can load the sample data included in the package by following the example in the Examples section of this help page.
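As a rough illustration of the minimum shape described above, a data frame with the two required columns could be built by hand. The tweet texts and hashtags below are invented for the example, and storing the hashtags as one comma-separated string per tweet (with NA when there are none) is an assumption about the expected format, not something this page confirms.

```r
# Hypothetical sample data; all values are made up for illustration.
tweets <- data.frame(
  text = c(
    "Great coffee downtown this morning #coffee #monday",
    "Traffic on the highway is terrible again"
  ),
  # Assumption: one comma-separated string per tweet, NA when there are none.
  hashtags = c("coffee,monday", NA_character_),
  stringsAsFactors = FALSE
)

# Check the two requirements stated in the argument description.
stopifnot(
  all(c("text", "hashtags") %in% names(tweets)),
  is.character(tweets$text)
)

str(tweets)
```

A data frame returned by load_tweets should already satisfy these requirements, so a manual check like this is mainly useful for data read from other sources.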

lda_model

Fitted LDA Model. Object of class LDA.

response

Type of response. Either "prob" for the predicted probabilities per topic or "max" for the single most probable topic (default).
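The difference between the two response types can be sketched with a toy probability table in base R. The numbers and the object names (probs, prob_response, max_response) are hypothetical, and the exact layout of the data frame that predict_lda returns may differ from this sketch.

```r
# Hypothetical per-tweet topic probabilities (rows: tweets, columns: topics).
probs <- matrix(
  c(0.70, 0.20, 0.10,
    0.15, 0.25, 0.60),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("tweet1", "tweet2"), c("topic1", "topic2", "topic3"))
)

# "prob"-style response: the full table of probabilities per topic.
prob_response <- as.data.frame(probs)

# "max"-style response: index of the most probable topic for each tweet.
max_response <- data.frame(topic = apply(probs, 1, which.max))

print(prob_response)
print(max_response)
```

Here tweet1 is assigned topic 1 and tweet2 is assigned topic 3, because those columns hold the largest probability in each row.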

remove_numbers

Logical. If TRUE remove tokens that consist only of numbers, but not words that start with digits, e.g. 2day. See tokens.

remove_punct

Logical. If TRUE remove all characters in the Unicode "Punctuation" [P] class, with exceptions for those used as prefixes for valid social media tags if preserve_tags = TRUE. See tokens.

remove_symbols

Logical. If TRUE remove all characters in the Unicode "Symbol" [S] class.

remove_url

Logical. If TRUE find and eliminate URLs beginning with http(s).
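The four remove_* arguments share their names and descriptions with arguments of tokens() in the quanteda package, which is presumably what "See tokens" refers to. The sketch below calls quanteda::tokens() directly on an invented sentence to show what these switches do. It assumes quanteda is installed, and it assumes (without confirmation from this page) that predict_lda passes the flags on in a similar way.

```r
library(quanteda)

# Invented example text containing a number, a URL, punctuation and a symbol.
txt <- c(t1 = "Meet me 2day at 10 near https://example.com, it costs $5!")

# Tokenize with the same switches that predict_lda exposes.
toks <- tokens(
  txt,
  remove_numbers = TRUE,
  remove_punct   = TRUE,
  remove_symbols = TRUE,
  remove_url     = TRUE
)

print(toks)
```

Setting any of the flags to FALSE and comparing the printed tokens is a quick way to see which parts of a tweet each switch removes.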

Value

Data frame of topic predictions or predicted probabilities per topic (see response).

Examples

## Not run: 

library(Twitmo)

# load tweets (included in package)
mytweets <- load_tweets(system.file("extdata", "tweets_20191027-141233.json", package = "Twitmo"))

# Pool tweets into longer pseudo-documents
pool <- pool_tweets(data = mytweets)
pooled_dfm <- pool$document_term_matrix

# fit your LDA model with 7 topics
model <- fit_lda(pooled_dfm, n_topics = 7, method = "Gibbs")

# Predict topics of tweets using your fitted LDA model
predict_lda(mytweets, model, response = "prob")

## End(Not run)


abuchmueller/Twitmo documentation built on Sept. 14, 2022, 8:06 p.m.