gensim: Utilities to interface to gensim.


Description

Utilities to interface to gensim.

Usage

gensim_ldamodel_as_LDA_Gibbs(model, dtm)

gensim_ldamodel_load(modeldir, modelname)

dtm_as_bow(dtm)

Arguments

model

An LDA model trained by gensim.

dtm

A document-term matrix (will be turned into a BOW data structure).

modeldir

Directory where a gensim LDA topic model has been saved.

modelname

Name of a gensim LDA topic model. A saved model consists of a set of files whose names all start with modelname.
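
The relation between modeldir and modelname can be illustrated with a minimal Python sketch that lists the files in a directory whose names start with a given model name. The helper model_files() and the file names used here ("demo", "demo.state", "demo.id2word", "other.txt") are hypothetical and only serve as an illustration; they are not part of this package or of gensim.

```python
from pathlib import Path
import tempfile


def model_files(modeldir, modelname):
    """Return the sorted names of all files in modeldir starting with modelname."""
    return sorted(
        p.name
        for p in Path(modeldir).iterdir()
        if p.is_file() and p.name.startswith(modelname)
    )


# Build a throwaway directory with hypothetical file names for demonstration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["demo", "demo.state", "demo.id2word", "other.txt"]:
        (Path(tmp) / name).touch()
    found = model_files(tmp, "demo")
    print(found)
```

A check of this kind may help to confirm that all files belonging to a model are present before trying to load it.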

Details

The function gensim_ldamodel_as_LDA_Gibbs() converts a gensim/Python model (which may have been loaded using gensim_ldamodel_load()) into an object of class LDA_Gibbs, as known from the topicmodels package, for further processing within R.

Use gensim_ldamodel_load() to load an LDA model computed by gensim. The return value is an LdaModel Python object that can be passed on to other functions or processed using the reticulate package.

The input to gensim's LDA modelling methods is a representation of corpora in a data format denoted as "BOW" (bag-of-words). The utility function dtm_as_bow() turns a sparse matrix (class simple_triplet_matrix) into the BOW input format required by gensim.
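
To make the BOW format more concrete, here is a minimal Python sketch of the target data structure: in gensim, a BOW corpus is a sequence of documents, each represented as a list of (token_id, count) tuples. The helper triplets_to_bow() is hypothetical and is not the implementation of dtm_as_bow(). It assumes that the sparse matrix is given as three parallel vectors i (document index), j (term index) and v (count) with 1-based indices as in R, and that the indices are shifted to 0-based values on the Python side.

```python
from collections import defaultdict


def triplets_to_bow(i, j, v, n_docs):
    """Turn triplet-form matrix entries into a list of (token_id, count) lists.

    Assumption: i and j are 1-based (as in R) and are shifted to 0-based ids.
    """
    per_doc = defaultdict(list)
    for row, col, count in zip(i, j, v):
        if count > 0:  # a sparse matrix normally stores non-zero cells only
            per_doc[row - 1].append((col - 1, int(count)))
    # One entry per document, including documents without any terms.
    return [sorted(per_doc[d]) for d in range(n_docs)]


# Toy matrix with 3 documents and 3 terms.
i = [1, 1, 2, 3, 3]
j = [1, 3, 2, 1, 2]
v = [2, 1, 4, 1, 3]
bow = triplets_to_bow(i, j, v, n_docs=3)
print(bow)
# [[(0, 2), (2, 1)], [(1, 4)], [(0, 1), (1, 3)]]
```

The nested list mirrors the row-wise view of the document-term matrix: each inner list holds the non-zero cells of one row.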

Author(s)

Andreas Blaette


Examples

## Not run: 
if (requireNamespace("reticulate")){
  gensim <- reticulate::import("gensim")

  modeldir <- system.file(package = "topicanalysis", "extdata", "gensim")
  modelname <- "germaparlmini"
  
  dtm <- readRDS(
    file = system.file(
      package = "topicanalysis", "extdata", "gensim", "germaparlmini_dtm.rds"
    )
  )
  
  lda <- gensim_ldamodel_load(
    modeldir = modeldir,
    modelname = modelname
  )

  y <- gensim_ldamodel_as_LDA_Gibbs(model = lda, dtm = dtm)
  topics_terms <- topicmodels::get_terms(y, 10)
  docs_topics <- topicmodels::get_topics(y, 5)
}

## End(Not run)

PolMine/polmineR.topics documentation built on March 6, 2020, 6:03 p.m.