similarity: Similarity

View source: R/similarity.R

Description

Splits the index into several smaller, disk-based sub-indexes (“shards”). If your entire index fits in memory (roughly one million documents per 1GB of RAM), you can use similarity_matrix instead. It is simpler but does not scale as well: it keeps the entire index in RAM, with no sharding. It also does not support adding new documents to the index dynamically.
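
The sharding idea can be illustrated with a minimal base R sketch. It does not call gensim or gensimr, and it keeps every shard in memory, whereas the shards described above are stored on disk. The toy data and the names docs, query, shard_size and cosine_to_rows are all hypothetical, chosen only for illustration. The sketch scores a query against the whole index at once, then against each shard separately, and checks that the two results agree.

```r
# Conceptual sketch only: base R, no gensim involved, shards held in memory.
set.seed(42)

# Hypothetical toy data: 10 documents over 5 features (dense for simplicity).
docs  <- matrix(runif(50), nrow = 10, ncol = 5)
query <- runif(5)

# Cosine similarity between each row of `m` and the vector `q`.
cosine_to_rows <- function(m, q) {
  as.vector(m %*% q) / (sqrt(rowSums(m^2)) * sqrt(sum(q^2)))
}

# Unsharded: the whole index is scored in one pass.
sims_full <- cosine_to_rows(docs, query)

# Sharded: split the row indices into chunks of at most `shard_size` documents,
# score each shard separately, then concatenate in the original row order.
shard_size <- 4
row_ids <- seq_len(nrow(docs))
shards <- split(row_ids, ceiling(row_ids / shard_size))
sims_sharded <- unlist(
  lapply(shards, function(rows) {
    cosine_to_rows(docs[rows, , drop = FALSE], query)
  }),
  use.names = FALSE
)

# Both approaches yield the same similarity scores.
stopifnot(isTRUE(all.equal(sims_full, sims_sharded)))
```

Because each shard is scored independently, only one shard needs to be loaded at a time, which is what lets a sharded index grow beyond available RAM.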

Usage

similarity(corpus, ...)

## S3 method for class 'gensim.corpora.mmcorpus.MmCorpus'
similarity(corpus, num_features, ...)

## S3 method for class 'mm_file'
similarity(corpus, num_features, ...)

## S3 method for class 'python.builtin.tuple'
similarity(corpus, num_features, ...)

Arguments

corpus

A corpus: a gensim MmCorpus, an mm_file object, or a Python tuple (see the S3 methods under Usage).

...

Any other parameters to pass to the underlying Python function; see the official gensim documentation.

num_features

Size of the dictionary, i.e. reticulate::py_len(dictionary).


news-r/gensimr documentation built on Jan. 9, 2021, 5:55 a.m.