fit_stm: Fit STM (Structural topic model)
In abuchmueller/Twitmo: Twitter Topic Modeling and Visualization for R

Description

Estimate a structural topic model

Usage

fit_stm(
  data,
  n_topics = 2L,
  xcov,
  remove_punct = TRUE,
  stem = TRUE,
  remove_url = TRUE,
  remove_emojis = TRUE,
  stopwords = "en",
  ...
)

Arguments

`data`	Data frame containing tweets and hashtags. Works with any data frame, as long as there is a "text" column of type character string and a "hashtags" column with comma separated character vectors. Can be obtained either by using `load_tweets` on a JSON object returned by Twitter's API v1.1 or by using `stream_in` on any JSON file, as long as it has a "text" and a "hashtags" field. If you are unsure about the requirements, you may load the sample data contained in the package by following the example in the Examples section of this help page.
`n_topics`	Integer with number of topics.
`xcov`	Either a formula with an empty left-hand side specifying external covariates (metadata) to use, e.g. `~favourites_count + retweet_count`, or a character vector (`c("favourites_count", "retweet_count")`) or comma separated character string (`"favourites_count,retweet_count"`) with the column names of the metadata to use as external covariates.
`remove_punct`	Logical. Indicates whether punctuation (includes Twitter hashtags and usernames) should be removed. Defaults to TRUE.
`stem`	Logical. If `TRUE` turn on word stemming for terms.
`remove_url`	Logical. If `TRUE` find and eliminate URLs beginning with http(s).
`remove_emojis`	Logical. If `TRUE` all emojis will be removed from tweets.
`stopwords`	A character vector, list of character vectors, dictionary or collocations object. See pattern for details. Defaults to `"en"` (English stopwords).
`...`	Additional arguments passed to stm.
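
The three accepted forms of `xcov` describe the same set of covariates. The following is a minimal sketch in base R of how they relate to one another. The helper `as_xcov_formula` is hypothetical and is not part of Twitmo; it only illustrates the equivalence and makes no claim about how `fit_stm` handles `xcov` internally.

```r
# Hypothetical helper (not part of Twitmo): coerce the three accepted
# `xcov` forms into a one-sided formula using base R only.
as_xcov_formula <- function(xcov) {
  if (inherits(xcov, "formula")) {
    return(xcov)
  }
  if (is.character(xcov)) {
    # Split comma separated strings and drop surrounding whitespace
    vars <- trimws(unlist(strsplit(xcov, ",", fixed = TRUE)))
    return(stats::reformulate(vars))
  }
  stop("`xcov` must be a formula or a character vector")
}

f1 <- as_xcov_formula(~ favourites_count + retweet_count)
f2 <- as_xcov_formula(c("favourites_count", "retweet_count"))
f3 <- as_xcov_formula("favourites_count,retweet_count")

# All three forms name the same covariates
all.vars(f1)
all.vars(f2)
all.vars(f3)
```

Whichever form is used, the named columns are expected to be present in `data`.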

Details

Use this function to estimate an STM from a data frame of parsed Tweets. Works with unpooled Tweets only. Pre-processing and fitting are done in one run.
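
As a rough illustration of the input shape described under `data`, the sketch below builds a toy data frame with a "text" and a "hashtags" column and checks for both. The tweets are made up, the `hashtags` column is assumed to hold one comma separated string per tweet, and `has_required_columns` is a hypothetical helper that is not part of Twitmo.

```r
# Toy input with the two required columns; the tweets are made up.
# Assumption: hashtags are stored as one comma separated string per tweet.
toy_tweets <- data.frame(
  text = c(
    "Coffee first, then code #rstats",
    "Sunny day at the lake #weekend #sun"
  ),
  hashtags = c("rstats", "weekend,sun"),
  retweet_count = c(3L, 0L),
  stringsAsFactors = FALSE
)

# Hypothetical check (not part of Twitmo): both columns must exist
# and "text" must be of type character.
has_required_columns <- function(data) {
  is.data.frame(data) &&
    all(c("text", "hashtags") %in% names(data)) &&
    is.character(data$text)
}

has_required_columns(toy_tweets)
```

A data frame that passes this check has the columns the `data` argument asks for; extra columns such as `retweet_count` can then serve as covariates in `xcov`.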

Value

Object of class stm. Additionally, pre-processed documents are appended into a named list called "prep".

Examples

## Not run: 

library(Twitmo)

# load tweets (included in package)
mytweets <- load_tweets(system.file("extdata", "tweets_20191027-141233.json", package = "Twitmo"))

# fit STM with tweets
stm_model <- fit_stm(mytweets,
  n_topics = 7,
  xcov = ~ retweet_count + followers_count + reply_count +
    quote_count + favorite_count,
  remove_punct = TRUE,
  remove_url = TRUE,
  remove_emojis = TRUE,
  stem = TRUE,
  stopwords = "en"
)

## End(Not run)

abuchmueller/Twitmo documentation built on Sept. 14, 2022, 8:06 p.m.