model_time_points: Fit multiple topic models to successive time intervals using...
In rturn/parseTweetFiles: Parse Tweet Files

model_time_points

R Documentation

Description

Fits a separate topic model to each of a series of successive time intervals, using the maptpx package on a time-stamped Tweet data frame. The number of topics is chosen independently for each interval by Bayes factor. Each fitted model is saved to a maptpx_model folder in the current directory, and an LDAvis visualization of it is saved to a maptpx_vis folder, also in the current directory. Both folders must exist before the function is run.
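
Since both output folders must exist beforehand, a setup step along the following lines could be run first. This is a minimal sketch in base R. The folder names come from the description above, and it assumes that "current directory" means the R working directory as reported by getwd().

```r
# Create the two output folders named in the description, relative to the
# working directory (assumed to be what "current directory" refers to).
out_dirs <- c("maptpx_model", "maptpx_vis")
for (d in out_dirs) {
  if (!dir.exists(d)) {
    dir.create(d)
  }
}

# Confirm that both folders are now present.
stopifnot(all(dir.exists(out_dirs)))
```

Checking with dir.exists() first means the step can be repeated without triggering warnings about folders that already exist.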

Usage

model_time_points(tweets.df, start.time, difference, num.steps, topic.min = 5,
  topic.max = 55, model.kill = 3)

Arguments

`tweets.df`	The data frame of time-stamped tweets to fit models to. It should have a column "created_at" containing times in POSIXct format.
`start.time`	The first time point to start model fitting.
`difference`	The length of time for each time interval (in hours).
`num.steps`	The number of time intervals to fit models to.
`topic.min`	The smallest number of topics to consider for each model. Defaults to 5.
`topic.max`	The largest number of topics to consider for each model. Defaults to 55.
`model.kill`	The number of models fit with a decreasing Bayes factor before the best model is chosen. A lower number tends to choose topic models with fewer topics, while a higher number may choose more topics. See the topics function in the maptpx package for more details. Defaults to 3.
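
To illustrate how start.time, difference and num.steps might combine into time windows, the sketch below builds a small data frame with a POSIXct "created_at" column and counts the tweets falling into each interval. It uses base R only. The toy data, the variable names breaks and interval_sizes, and the half-open interval convention are assumptions made for illustration; the function's own windowing logic is not shown in this documentation and may differ.

```r
# Hypothetical toy data: five tweets with POSIXct timestamps in "created_at".
tweets.df <- data.frame(
  text = c("a", "b", "c", "d", "e"),
  created_at = as.POSIXct(
    c("2015-04-24 12:15", "2015-04-24 13:00", "2015-04-24 13:45",
      "2015-04-24 14:30", "2015-04-24 16:00"),
    tz = "GMT"
  ),
  stringsAsFactors = FALSE
)

start.time <- as.POSIXct("2015-04-24 12:11", tz = "GMT")
difference <- 1.5   # interval length in hours
num.steps  <- 3     # number of intervals

# Interval boundaries: num.steps + 1 time points spaced `difference` hours apart.
breaks <- start.time + (0:num.steps) * difference * 3600

# Assumption: each interval is half-open, [start, end).
interval_sizes <- vapply(seq_len(num.steps), function(i) {
  in_window <- tweets.df$created_at >= breaks[i] &
    tweets.df$created_at < breaks[i + 1]
  sum(in_window)
}, integer(1))

interval_sizes
```

With this toy data the three windows hold 2, 2 and 1 tweets. Adding a number of seconds to a POSIXct value keeps the time zone, so the boundaries stay in GMT.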

Value

The number of topics chosen for each time interval.
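
The effect of model.kill on the chosen number of topics can be illustrated with a toy stopping rule. The sketch below is not the maptpx implementation: choose_with_kill is a hypothetical helper, the scores are made-up numbers standing in for Bayes factors, and it assumes that a "drop" means a score lower than that of the previous candidate.

```r
# Hypothetical helper, not part of maptpx or rturn: return the index of the
# best score seen, stopping the scan after `kill` consecutive decreases.
choose_with_kill <- function(scores, kill = 3) {
  best <- 1
  drops <- 0
  for (i in seq_along(scores)[-1]) {
    if (scores[i] < scores[i - 1]) {
      drops <- drops + 1
    } else {
      drops <- 0
    }
    if (scores[i] > scores[best]) best <- i
    if (drops >= kill) break
  }
  best
}

# Made-up scores for candidate topic counts 5..11 (illustrative values only).
topic_counts <- 5:11
scores <- c(10, 14, 13, 12, 15, 14, 13)

topic_counts[choose_with_kill(scores, kill = 1)]
topic_counts[choose_with_kill(scores, kill = 3)]
```

With these made-up scores, kill = 1 stops at the first drop and selects 6 topics, while kill = 3 scans past the temporary dip and selects 9. This matches the tendency described for model.kill: lower values favor fewer topics.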

Examples

## Not run: 
time = as.POSIXct('2015-04-24 12:11', tz = "GMT")
model_time_points(tweets.df, time, difference = 1.5, num.steps = 96, topic.min = 5,
  topic.max = 10, model.kill = 4)
## End(Not run)

rturn/parseTweetFiles documentation built on July 31, 2023, 3:43 p.m.