topics: Topic Coherence

Description

Calculate topic coherence for topic models.

Usage

model_coherence(models, ...)

Arguments

models

A topic model (e.g. LDA or LSI), or a list of such models.

...

Any other options to pass to the underlying coherence model, e.g. corpus, dictionary, or coherence; see the official gensim documentation.

Details

Higher coherence is better: when comparing models, prefer the one whose get_coherence() method returns the larger value (see Examples).
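To give some intuition for why a larger value is preferred, here is a minimal base R sketch of a UMass-style score, the measure selected by coherence = 'u_mass' in the example. It is a simplified illustration only, not gensim's implementation: the toy documents, the helper names doc_freq and umass_sketch, and the +1 smoothing are all assumptions made for this sketch.

```r
# Toy corpus (illustrative data): each document is a character vector of tokens.
docs <- list(
  c("human", "computer", "interface"),
  c("computer", "system", "user"),
  c("graph", "trees", "minors"),
  c("graph", "minors", "survey"),
  c("human", "system", "graph")
)

# Hypothetical helper: number of documents containing every word in `words`.
doc_freq <- function(words, docs) {
  sum(vapply(docs, function(d) all(words %in% d), logical(1)))
}

# Hypothetical helper: simplified UMass-style score for one topic's top words,
# ordered from most to least probable. For each pair where w_j ranks above w_i,
# take log((D(w_i, w_j) + 1) / D(w_j)) and average over all pairs.
# Assumes every top word occurs in at least one document.
umass_sketch <- function(top_words, docs) {
  scores <- c()
  for (i in 2:length(top_words)) {
    for (j in 1:(i - 1)) {
      co <- doc_freq(c(top_words[i], top_words[j]), docs)
      scores <- c(scores, log((co + 1) / doc_freq(top_words[j], docs)))
    }
  }
  mean(scores)
}

# Words that co-occur in the same documents score higher (closer to zero).
coherent   <- umass_sketch(c("graph", "minors", "trees"), docs)
incoherent <- umass_sketch(c("graph", "computer", "survey"), docs)
coherent > incoherent
```

In this sketch both scores are negative, so "higher" means closer to zero; words that rarely share a document pull the average down.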

Examples

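library(gensimr)

# load the example corpus (assumes the `corpus` dataset shipped with gensimr)
data(corpus, package = "gensimr")

# preprocess the corpus
texts <- prepare_documents(corpus)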
dictionary <- corpora_dictionary(texts)
corpus <- doc2bow(dictionary, texts)

# create 2 models to compare: one trained for 50 iterations, one for a single iteration
good_lda_model <- model_lda(
  corpus = corpus, 
  id2word = dictionary, 
  iterations = 50L, 
  num_topics = 2L
)
bad_lda_model <- model_lda(
  corpus = corpus, 
  id2word = dictionary, 
  iterations = 1L, 
  num_topics = 5L
)

# create coherence models
good_cm <- model_coherence(
  model = good_lda_model, 
  corpus = corpus, 
  dictionary = dictionary, 
  coherence = 'u_mass'
)
bad_cm <- model_coherence(
  model = bad_lda_model, 
  corpus = corpus, 
  dictionary = dictionary, 
  coherence = 'u_mass'
)

# compare coherence: the higher value indicates the more coherent model
good_cm$get_coherence()
bad_cm$get_coherence()

news-r/gensimr documentation built on Jan. 9, 2021, 5:55 a.m.