model_time_points: Fit multiple topic models to successive time intervals using...

View source: R/parseTweetFiles.R

model_time_points    R Documentation

Description

Fits a separate topic model to each of a series of successive time intervals, using the maptpx package on a data frame of time-stamped tweets. The number of topics is chosen independently for each interval using Bayes factors. Each fitted model is saved to a maptpx_model folder in the current directory, and an LDAvis visualization of the model is saved to a maptpx_vis folder, also in the current directory. Both folders must exist before the function is called.
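
Since both output folders have to exist beforehand, a minimal sketch of creating them in the current working directory could look like this. It uses only base R; the folder names are taken from the description above.

```r
# Create the two output folders in the current working directory
# if they are not there yet.
for (folder in c("maptpx_model", "maptpx_vis")) {
  if (!dir.exists(folder)) {
    dir.create(folder)
  }
}
```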

Usage

model_time_points(tweets.df, start.time, difference, num.steps, topic.min = 5,
  topic.max = 55, model.kill = 3)

Arguments

tweets.df

The data frame of time-stamped tweets to fit models to. It should have a column "created_at" containing times of class POSIXct.

start.time

The time point at which model fitting starts (the beginning of the first interval).

difference

The length of time for each time interval (in hours).

num.steps

The number of time intervals to fit models to.

topic.min

The smallest number of topics to consider for each model. Defaults to 5.

topic.max

The largest number of topics to consider for each model. Defaults to 55.

model.kill

The number of models with a decreasing Bayes factor to fit before choosing the best model. A lower number tends to choose topic models with fewer topics, while a higher number may choose more topics. See the topics function in the maptpx package for more details. Defaults to 3.
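
The effect of model.kill can be illustrated with a small base R sketch. It assumes the search evaluates topic counts in increasing order and stops after model.kill consecutive drops in the Bayes factor. The helper pick_topics is hypothetical (it is not part of maptpx or this package), and the scores are made up for illustration.

```r
# Hypothetical helper: scan Bayes factors for increasing topic counts,
# stop after `kill` consecutive drops, and return the topic count with
# the highest Bayes factor seen up to that point.
pick_topics <- function(ks, bayes.factors, kill = 3) {
  best <- 1
  drops <- 0
  for (i in seq_along(ks)) {
    if (i > 1 && bayes.factors[i] < bayes.factors[i - 1]) {
      drops <- drops + 1
    } else {
      drops <- 0
    }
    if (bayes.factors[i] > bayes.factors[best]) best <- i
    if (drops >= kill) break
  }
  ks[best]
}

# Made-up scores: a local peak at 6 topics and a higher peak at 10 topics.
ks <- 5:12
bf <- c(10, 12, 11, 10, 13, 15, 14, 13)

few.topics  <- pick_topics(ks, bf, kill = 2)  # stops early, picks 6
more.topics <- pick_topics(ks, bf, kill = 3)  # searches further, picks 10
```

With the smaller value the search ends at the first run of two drops and never reaches the higher peak, which matches the tendency described for model.kill.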

Value

The number of topics chosen for each time interval.
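
To make the notion of a time interval concrete, the sketch below derives interval boundaries from start.time, difference (in hours) and num.steps, and counts how many tweets fall into each one. The toy data frame stands in for tweets.df, and treating each interval as half-open (start included, end excluded) is an assumption, not something the package documents.

```r
# Toy data standing in for tweets.df; only created_at matters here.
start.time <- as.POSIXct("2015-04-24 12:00", tz = "GMT")
tweets.df <- data.frame(
  text = c("a", "b", "c", "d"),
  created_at = start.time + c(10, 100, 200, 250) * 60  # minutes after start
)

difference <- 1.5  # hours per interval
num.steps <- 3     # number of intervals

# num.steps + 1 boundaries, each `difference` hours apart.
breaks <- start.time + (0:num.steps) * difference * 3600

# Index of the interval each tweet falls into (assumed [start, end)).
interval <- findInterval(as.numeric(tweets.df$created_at), as.numeric(breaks))

# Tweets per interval; tweets outside the covered range are dropped.
counts <- as.integer(table(factor(interval, levels = seq_len(num.steps))))
```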

Examples

## Not run: 
time <- as.POSIXct('2015-04-24 12:11', tz = "GMT")
model_time_points(tweets.df, time, difference = 1.5, num.steps = 96, topic.min = 5,
  topic.max = 10, model.kill = 4)

## End(Not run)

rturn/parseTweetFiles documentation built on July 31, 2023, 3:43 p.m.