FindTopicsNumber: FindTopicsNumber


View source: R/main.R

Description

Calculates different metrics to estimate the most preferable number of topics for an LDA model.

Usage

FindTopicsNumber(
  dtm,
  topics = seq(10, 40, by = 10),
  metrics = "Griffiths2004",
  method = "Gibbs",
  control = list(),
  mc.cores = NA,
  return_models = FALSE,
  verbose = FALSE,
  libpath = NULL
)

Arguments

dtm

An object of class "DocumentTermMatrix" with term-frequency weighting or an object coercible to a "simple_triplet_matrix" with integer entries.

topics

Vector of candidate numbers of topics; one model is fitted for each value so that the models can be compared.

metrics

String or vector of possible metrics: "Griffiths2004", "CaoJuan2009", "Arun2010", "Deveaud2014".

method

The method to be used for fitting; see LDA.

control

A named list of the control parameters for estimation or an object of class "LDAcontrol".

mc.cores

NA, an integer, or a cluster; the number of CPU cores used to process models simultaneously. If an integer, a cluster is created on the local machine. If a cluster, it is used but not destroyed (this allows multiple-node clusters). Defaults to NA, which triggers auto-detection of the number of cores on the local machine.

return_models

Whether to return the fitted model objects of class "LDA". Defaults to FALSE. Setting it to TRUE requires the tibble package.

verbose

If FALSE (default), suppress all warnings and additional information.

libpath

Path to R packages (use only if your R installation can't find the 'topicmodels' package; see [issue #3](https://github.com/nikita-moor/ldatuning/issues/3)). For example: "C:/Program Files/R/R-2.15.2/library" (Windows), "/home/user/R/x86_64-pc-linux-gnu-library/3.2" (Linux).
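The mc.cores and control arguments described above can be combined as in the following minimal sketch. It assumes the ldatuning, topicmodels, and parallel packages are installed; the two-worker cluster, the small three-value topic range, and the seed value are illustrative choices, not recommendations from the package.

```r
library(ldatuning)
library(topicmodels)
library(parallel)

data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:10, ]  # tiny subset so the sketch runs quickly

# Create the cluster ourselves; per the mc.cores description,
# FindTopicsNumber uses a supplied cluster but does not destroy it.
cl <- makeCluster(2L)  # illustrative worker count

result <- FindTopicsNumber(
  dtm,
  topics   = 2:4,                 # illustrative candidate topic counts
  metrics  = "CaoJuan2009",
  method   = "Gibbs",
  control  = list(seed = 77),     # fixed seed (arbitrary value) for repeatable fits
  mc.cores = cl
)

# The cluster is still alive here, so stopping it is the caller's job.
stopCluster(cl)

print(result)
```

Passing an existing cluster is mainly useful when the same workers are reused across several calls or when the cluster spans more than one machine; for a single local run, an integer such as `mc.cores = 2L` is simpler.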

Value

Data frame with the numbers of topics and the corresponding values of one or more metrics. Can be used directly by FindTopicsNumber_plot to draw a plot.
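As a rough illustration of working with the returned value, the sketch below requests two metrics, reads a candidate topic count off each column, and passes the result to FindTopicsNumber_plot. It assumes the result has a `topics` column plus one column per requested metric, and that CaoJuan2009 is a metric to minimize while Deveaud2014 is one to maximize (as presented in the package vignette); the variable names `best_cao` and `best_dev` are made up for this example.

```r
library(ldatuning)
library(topicmodels)

data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:10, ]  # tiny subset so the sketch runs quickly

result <- FindTopicsNumber(
  dtm,
  topics   = 2:5,
  metrics  = c("CaoJuan2009", "Deveaud2014"),
  method   = "Gibbs",
  control  = list(seed = 77),   # arbitrary seed for repeatable fits
  mc.cores = 1L
)

# Inspect the structure: one row per candidate number of topics.
str(result)

# Assumption: lower is better for CaoJuan2009, higher for Deveaud2014.
best_cao <- result$topics[which.min(result$CaoJuan2009)]
best_dev <- result$topics[which.max(result$Deveaud2014)]
cat("CaoJuan2009 suggests", best_cao, "topics;",
    "Deveaud2014 suggests", best_dev, "topics\n")

# Draw the metric curves from the same data frame.
FindTopicsNumber_plot(result)
```

The metrics will not always agree, and on a corpus this small the suggested values are not meaningful; the point is only to show how the data frame is consumed.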

Examples

## Not run: 

library(topicmodels)
data("AssociatedPress", package="topicmodels")
dtm <- AssociatedPress[1:10, ]
FindTopicsNumber(dtm, topics = 2:10, metrics = "Arun2010", mc.cores = 1L)

## End(Not run)

ldatuning documentation built on April 21, 2020, 9:05 a.m.