model_lsi: Latent Semantic Indexing Model


View source: R/models.R

Description

Transform a corpus into a latent n-dimensional space via Latent Semantic Indexing (LSI).

Usage

model_lsi(corpus, distributed = FALSE, ...)

load_lsi(file)

Arguments

corpus

Corpus as returned by wrap. A tf-idf/bag-of-words transformation is recommended for LSI.

distributed

If TRUE, distributed mode (parallel execution across several machines) is used.

...

Any other options passed on to the underlying model; see the official gensim documentation for the LSI model.

file

Path to a saved model.

Details

A target dimensionality (num_topics) of 200–500 is recommended as a “golden standard” (https://dl.acm.org/citation.cfm?id=1458105).
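To make the role of num_topics concrete, the following is a minimal base R sketch of the idea behind LSI: a truncated singular value decomposition of a term-document matrix. It is an illustration only, not how gensimr or gensim compute the model internally. The matrix tdm, its made-up counts, and the names u_k, d_k, v_k, doc_coords and tdm_k are all hypothetical and introduced purely for this example.

```r
# Toy term-document matrix (made-up counts): rows are terms, columns are documents.
tdm <- matrix(
  c(2, 1, 0, 0,
    1, 2, 0, 0,
    0, 0, 1, 2,
    0, 0, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("human", "computer", "graph", "tree"),
    paste0("doc", 1:4)
  )
)

# Target dimensionality of the latent space (kept tiny for this toy matrix).
num_topics <- 2L

# Full SVD: tdm = U %*% diag(d) %*% t(V)
decomp <- svd(tdm)

# Keep only the num_topics largest singular values and their vectors.
u_k <- decomp$u[, seq_len(num_topics), drop = FALSE]
d_k <- decomp$d[seq_len(num_topics)]
v_k <- decomp$v[, seq_len(num_topics), drop = FALSE]

# Coordinates of each document in the latent space: one row per document.
doc_coords <- v_k %*% diag(d_k, nrow = num_topics)
rownames(doc_coords) <- colnames(tdm)

# Rank-num_topics approximation of the original matrix.
tdm_k <- u_k %*% diag(d_k, nrow = num_topics) %*% t(v_k)

print(round(doc_coords, 3))
```

A larger num_topics retains more singular values and therefore more of the original matrix, at the cost of a higher-dimensional representation; the 200–500 range quoted above applies to real corpora, not to a toy matrix like this one.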

Examples

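library(gensimr)

# `corpus` is assumed to be a set of raw text documents
# (for example, the sample data shipped with the package)
docs <- prepare_documents(corpus)

# map each unique token to an integer id
dictionary <- corpora_dictionary(docs)

# convert the documents to bag-of-words vectors
corpora <- doc2bow(dictionary, docs)

# fit model
lsi <- model_lsi(corpora, id2word = dictionary, num_topics = 2L)
lsi$print_topics()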

news-r/gensimr documentation built on Jan. 9, 2021, 5:55 a.m.